|
|
-
I guess I haven't written in a while. I had more to say on the last topic, but I will have to come back to it. At the moment, my head is full of ideas inspired partly by a new D Programming Language feature (http://www.digitalmars.com/d/lazy-evaluation.html) and partly by this article ("Enforcements" http://www.ddj.com/dept/cpp/184403864) which I found for the first time a few days ago when I was reading about D's lazy evaluation (I know, it's an old article).
The Enforcements article gives extremely good coverage of how to use templates and macros in C++ to implement a rich "idiom" of code that enforces various conditions at runtime. The article on D's lazy evaluation shows among other things how this feature can be used to improve upon the implementation of the enforcement idiom in D. I want to take this a bit further and look at how readability and writeability of such idioms could be improved by building support for them into the language from the beginning. Here are some initial observations and assumptions I have when approaching this problem:
- Asserts are very commonly used and useful to developers, enough so on both points that I believe it makes sense to consider special support for them in the language. This is not entirely unprecedented in modern language design ("assert" is a keyword in both Java and D for example).
- From my experience, asserts can have a wide range of behavior between projects and occasionally even within projects. If asserts in the language are not rich enough to enable this wide range of behaviors, many developers will not use them in favor of something hand-rolled.
- Despite the wide range of behavior, asserts have fairly consistent semantics: if this fails, it is unexpected.
- Asserts can have special meaning to analysis tools and even to the optimizer.
In addition, I was thinking about a broader goal:
- A modern language should not need a preprocessor. Many things that can be done with a preprocessor that can't be replicated with templates/mixins/inline functions are used almost exclusively for asserts and similar error detection/handling/control idioms. By supporting these functions in a more intuitive way, specifically for asserts if necessary, we can knock a couple more things off the list of "why it is useful to have a preprocessor".
Among the preprocessor functions/tricks I am referring to are:
- Converting a macro parameter to a string (the # preprocessor operator in C and C++). Assert macros often do this to turn the assert condition into a debugging message (and believe me, this is extremely helpful when debugging).
- Having access to predefined macros such as __LINE__, __FUNCTION__, etc. at the point of expansion.
- Being able to goto a label if that is how you do your error handling (I'm sure I will have a post on this later). Actually, to be honest I don't think I've seen too much of this type of thing in an assert macro, but I have thought of using it myself, to provide a safe fallback to asserts and to help satisfy analysis tools.
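To make the first two tricks concrete, here is a minimal C++ sketch of the kind of macro I have in mind. The names MY_ENFORCE, enforce_fail, and demo_enforce are made up for illustration; this is not the implementation from the Enforcements article, just the bare preprocessor mechanics.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: formats "file:line: enforcement failed: expr" and throws.
inline void enforce_fail(const char* expr, const char* file, int line) {
    std::ostringstream msg;
    msg << file << ":" << line << ": enforcement failed: " << expr;
    throw std::runtime_error(msg.str());
}

// #cond stringizes the condition; __FILE__ and __LINE__ are expanded at the
// point where the macro is used, which a plain function cannot do for itself.
#define MY_ENFORCE(cond) \
    do { if (!(cond)) enforce_fail(#cond, __FILE__, __LINE__); } while (0)

// Hypothetical demo: returns the failure message, or "" if the check passes.
inline std::string demo_enforce(int x) {
    try {
        MY_ENFORCE(x > 0);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "";
}
```

Calling demo_enforce(-1) yields a message that contains the text of the condition itself ("x > 0") along with the file and line of the check, which is exactly the information that is hard to get without a preprocessor.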
I'll outline some of my ideas in the next post.
|
-
I believe there is some need for method invocations/function calls (depending on what paradigm you are working within) to be programmatically accessible as objects, even in an unmanaged language. One important reason is debugging support.
Consider the situation where your code is a framework or middle layer that calls client code at the request of other client code. Sometimes a failure of some kind will occur in your code, not due to a fault in your code, but because the client code didn't follow the rules. While it is considered good programming practice in managed code to make your code robust against this situation, in unmanaged code, practical experience has shown that 1) this is not always possible, and 2) even when possible, this is not always a good idea because you will often lose valuable debugging information. Anyway, without getting too far off topic, suffice it to say that for certain types of code, it is not rare for a failure in your code to be the fault of someone else's code. In such a case, debugging support is a worthwhile nonfunctional requirement.
If you are aware of particular things that client code can do wrong that will make such-and-such a function in your code fail, it can be tremendously helpful, both to support personnel and also to developers of client code, to provide a debugger extension that analyzes the problem and figures out more specifically what is the likely ultimate cause. This may require looking at local variables in your function, the call stack, or other data specific to the function call.
|
-
Fortran has an interesting (and somewhat unusual) construct known as "arithmetic if". It looks something like this:
IF (arithmetic expression) label1, label2, label3
If the arithmetic expression evaluates to less than zero, control jumps to label1, equal to zero to label2, and greater than zero to label3. This construct is most often used to compare two numbers, in which case the arithmetic expression is of the form (x-y).
Arithmetic if fell out of favor with the advent of structured programming and is hardly ever used now. Since it is a disguised goto, all of the arguments against goto apply to arithmetic if as well. In addition, the syntax is considered by many modern programmers to be one of Fortran's particularly arcane constructs.
I don't think there is anything wrong with the idea of arithmetic if, however. Think of how often you write something like the following:
if (x < y) { ... } else if (x == y) { ... } else { ... }
This has the advantage of being structured and of using standard control flow constructs, but I don't think it is any more readable than the old arithmetic if. Here are some issues:
- Without reading the complex statement in its entirety, it is not clear that this is the equivalent of an arithmetic if.
- From a readability standpoint, there isn't a clear way to handle the final else statement. Consider for the moment integer comparisons, where <, ==, and > cover all the bases. You could write the final else as written above, in which case it stands out as different from the others in not having an if comparison (and because of this, you have to read the entire construct to know that this handles the > case). Or you could write the final else as "else if (x > y)", which at first glance may appear to be letting something slip through the cracks, as there is no final unqualified else.
- For floating point comparisons, these two solutions are not equivalent.
Besides readability, there are a couple other problems:
- There isn't a simple way to verify that there are no overlapping cases or missing cases. This concern is biggest for floating point comparisons, where there exist several not mutually exclusive comparison predicates.
- This construct is particularly well suited to optimization. However, as an if...else if...else construct, the optimizer needs to look harder to find it.
I want to resurrect arithmetic if in a more readable form. The most reasonable approach seems to be a special type of switch statement:
switch (x ? y)
{
case <:
...
case ==:
...
case >:
...
}
(I apologize for the spacing; it appears this blog software is not very friendly to code)
This approach has the additional benefit of being able to take advantage of the default case policy of the language (I personally like D's implicit assert on default for missing default statements).
This construct could be more flexible than the example I have given above. There is no reason, for example, that there couldn't be "case <=", "case !=", "case unordered", etc., as long as the cases don't overlap.
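For comparison, here is roughly how close one can get in C++ today without new syntax. This is only a sketch; Ordering, compare, and describe are names I made up. The comparison is done once, the result is an enumeration, and an ordinary switch covers the cases, including the unordered case that arises with floating point NaN.

```cpp
#include <string>

// Hypothetical result type for a single three-way (plus unordered) comparison.
enum class Ordering { Less, Equal, Greater, Unordered };

template <typename T>
Ordering compare(const T& x, const T& y) {
    if (x < y)  return Ordering::Less;
    if (x == y) return Ordering::Equal;
    if (x > y)  return Ordering::Greater;
    return Ordering::Unordered;  // none of <, ==, > held, e.g. a NaN operand
}

// The switch makes it visible at a glance that every case is handled.
inline std::string describe(double x, double y) {
    switch (compare(x, y)) {
        case Ordering::Less:      return "less";
        case Ordering::Equal:     return "equal";
        case Ordering::Greater:   return "greater";
        case Ordering::Unordered: return "unordered";
    }
    return "unreachable";
}
```

Many compilers can warn when a switch over an enumeration misses a case, which gives some of the overlap/missing-case checking I want, though it cannot express combined cases like "case <=" the way a built-in construct could.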
|
-
I was thinking the other day about certain computationally intensive numerical problems, particularly matrix multiplication, finding the eigenvalues of a matrix, and the Fast Fourier Transform, and how choosing the right algorithm depends on such intricacies as multiprocessor architecture, cache performance, etc. Actually, "choosing the right algorithm" is a more difficult task than it sounds, as I will explain. For each of these problems, there exists a divide and conquer algorithm with asymptotically better performance than the straightforward approach. However, the divide and conquer algorithm performs worse for problem sizes below a certain threshold. Therefore, it is common practice to use the divide and conquer algorithm down to a certain size, and then switch to the straightforward algorithm. To give a more familiar example for those who haven't done much numerical programming, think of quicksort. Usually it makes sense to use quicksort until the problem size is small enough and then switch to a simple quadratic sort such as insertion sort.
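The quicksort case can be sketched in a few lines of C++. The cutoff value kSmallCutoff and the function names are assumptions for illustration, and I use insertion sort for the small case because it is a common choice in practice; the right cutoff has to be measured on the target machine.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical cutoff; the best value depends on the machine and must be profiled.
constexpr std::ptrdiff_t kSmallCutoff = 16;

// Straightforward algorithm: sorts a[lo..hi] in place (no-op for empty ranges).
inline void insertion_sort(std::vector<int>& a, std::ptrdiff_t lo, std::ptrdiff_t hi) {
    for (std::ptrdiff_t i = lo + 1; i <= hi; ++i) {
        int key = a[i];
        std::ptrdiff_t j = i - 1;
        while (j >= lo && a[j] > key) { a[j + 1] = a[j]; --j; }
        a[j + 1] = key;
    }
}

// Divide and conquer down to the cutoff, then switch algorithms.
inline void hybrid_sort(std::vector<int>& a, std::ptrdiff_t lo, std::ptrdiff_t hi) {
    if (hi - lo + 1 <= kSmallCutoff) { insertion_sort(a, lo, hi); return; }
    int pivot = a[lo + (hi - lo) / 2];
    std::ptrdiff_t i = lo, j = hi;
    while (i <= j) {                       // Hoare-style partition around pivot
        while (a[i] < pivot) ++i;
        while (a[j] > pivot) --j;
        if (i <= j) { std::swap(a[i], a[j]); ++i; --j; }
    }
    hybrid_sort(a, lo, j);
    hybrid_sort(a, i, hi);
}

inline void hybrid_sort(std::vector<int>& a) {
    if (!a.empty()) hybrid_sort(a, 0, static_cast<std::ptrdiff_t>(a.size()) - 1);
}
```

The interesting part is the single comparison against kSmallCutoff: everything that follows in this post is about where that one number should come from.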
There are important differences between quicksort and the numerical problems I mentioned above:
1. These numerical problems are highly parallelizable, making multiprocessor architecture particularly relevant in choosing the right algorithm.
2. They generally deal with huge amounts of data, making cache performance (and those other variables that affect it, such as cache size, available memory, memory latency, etc.) relevant.
3. They are very computationally intensive, making even small improvements in performance welcome.
Let's pick one of these problems and look at it in more detail: matrix multiplication. The straightforward algorithm for matrix multiplication multiplies two n×n matrices in O(n^3) time. There is another algorithm, the Strassen algorithm, that does it in O(n^2.807) time, but for small matrices it is generally much slower (Strassen's algorithm is also less numerically stable than the straightforward algorithm, but sometimes this is acceptable, and for some classes of matrices, it may be known in advance that this numerical instability has a much lesser effect than the general case). Strassen's algorithm is divide and conquer, so it is possible to use it down to a certain size and then switch over. What is that size? This is highly dependent on the factors mentioned above. Generally, to get the best possible performance, it will be necessary to do some profiling and analyze the results to find the optimum cutoff size.
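For readers wondering where the odd exponent comes from, here is the standard recurrence argument, written out as a sketch. T(n) denotes the running time on n×n matrices; Strassen's algorithm replaces the eight half-size block multiplications of the straightforward blocked approach with seven, at the cost of extra additions.

```latex
\begin{align*}
T_{\text{straightforward}}(n) &= 8\,T(n/2) + \Theta(n^2) = \Theta\!\left(n^{\log_2 8}\right) = \Theta(n^3) \\
T_{\text{Strassen}}(n) &= 7\,T(n/2) + \Theta(n^2) = \Theta\!\left(n^{\log_2 7}\right) \approx \Theta\!\left(n^{2.807}\right)
\end{align*}
```

The extra additions are hidden in the constant factor of the second line, which is why the straightforward algorithm still wins for small n.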
There are a few possibilities of how to get the data from profiling into the program or library. In languages where it is feasible to do so, you could generate some code (probably something pretty simple) that captures this data, and include it into your program. For example, in C++, you could generate an include file containing:
#define STRASSEN_CUTOFF 10000
and whatever other code is necessary to make it work. The problem with this approach is that if you wish to port the solution to another computer with the same architecture but a different hardware configuration, or even upgrade the same computer, for example adding more memory, you need to recompile the library to get the best performance.
Another possibility is to get the cutoff value from a configuration file, which is read at run time. The problem with this is that you lose some potential for certain optimizations. For example, there may be optimizations possible for the straightforward algorithm that depend on knowing the maximum size of the input.
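As a sketch of the run-time option, the cutoff could be read from a simple key=value configuration stream. The key name strassen_cutoff and the function read_cutoff are assumptions for illustration, not part of any existing library.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical reader: scans "key=value" lines for strassen_cutoff and
// returns its value, or the fallback if the key is missing or malformed.
inline std::size_t read_cutoff(std::istream& in, std::size_t fallback) {
    const std::string key = "strassen_cutoff=";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, key.size(), key) == 0) {
            std::istringstream value(line.substr(key.size()));
            std::size_t n = 0;
            if (value >> n) return n;
        }
    }
    return fallback;
}
```

Because the value only becomes known when this function runs, the compiler has to treat the cutoff as an ordinary variable, which is precisely the lost optimization potential described above.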
With load-time configuration information, combined with just-in-time compiling, you would get the best of both worlds. For example, the provider of the library could provide a large number of configuration files that are optimized for certain hardware configurations, and select one with a tool based on the particular machine's configuration, or it could even provide a tool that generates the configuration file from scratch. To get these benefits, however, I believe you really do need the language to provide such a configuration feature natively.
|
-
I want to step back and look at some more specific features. One feature that can be highly controversial in language design (and the evolution of existing languages) is operator overloading. Some languages have no operator overloading and a fairly limited number of operators. Many allow overloading of most existing operators but don't allow new operators to be defined. A few languages allow new operators to be defined.
The biggest concern with operator overloading for the languages that disallow it seems to be misuse leading to poor readability (this is, for example, why Java does not have operator overloading, despite many requests for the feature). In the most general case, there isn't much that can be done to prevent misuse of a feature such as operator overloading. However, it would be helpful if the core libraries for the language set a good example. To give an example of what I mean, in my opinion, C++'s use of the << and >> operators with iostreams to mean "put into a stream" and "get from a stream" would be confusing for someone who is new to C++ but is already familiar with the << and >> operators as shift left/shift right. Certainly these operators for iostreams have nothing to do with their usual usage. This is not to say classes such as the iostream classes should not have a way to do their work with operators, they just shouldn't use semantically unrelated operators (I actually don't like iostream and don't believe it has proven itself, but this is another issue).
There are a couple of things that can be done, however, to prevent gross misuse. I think the most elegant solution I have seen so far is in the D programming language. D overloaded operators are not defined with the "operator ==" syntax, but via specific function names that correspond to one or more operators. For example, the "opCmp" function handles <, <=, >, and >=. I'm not quite convinced I like this particular syntax, but something like it may be unavoidable. Anyway, what I like about this solution is that
- It is not necessary to write four separate boilerplate functions.
- It is generally not possible to abuse operator overloading for purposes semantically incompatible with the original use of the operators.
I say "generally not possible" because there are still edge cases where operator overloading may be misused.
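The same idea can be approximated in C++ with a small helper template, shown here as a sketch. Comparable, Version, and cmp are hypothetical names; the point is only that the class author writes one comparison function and the four relational operators are derived from it, so they cannot disagree with each other.

```cpp
// Hypothetical mixin: derives <, <=, >, >= from a single cmp() member that
// returns a negative, zero, or positive int (similar in spirit to D's opCmp).
template <typename Derived>
struct Comparable {
    friend bool operator<(const Derived& a, const Derived& b)  { return a.cmp(b) < 0; }
    friend bool operator<=(const Derived& a, const Derived& b) { return a.cmp(b) <= 0; }
    friend bool operator>(const Derived& a, const Derived& b)  { return a.cmp(b) > 0; }
    friend bool operator>=(const Derived& a, const Derived& b) { return a.cmp(b) >= 0; }
};

// Example type: only cmp() is written by hand.
struct Version : Comparable<Version> {
    int major_num;
    int minor_num;
    Version(int ma, int mi) : major_num(ma), minor_num(mi) {}

    int cmp(const Version& o) const {
        if (major_num != o.major_num) return major_num < o.major_num ? -1 : 1;
        if (minor_num != o.minor_num) return minor_num < o.minor_num ? -1 : 1;
        return 0;
    }
};
```

With this in place, Version(1, 2) < Version(1, 3) and Version(2, 0) >= Version(1, 9) both hold without any per-operator code. Unlike in D, nothing stops a C++ author from also writing an inconsistent operator< by hand, which is why I prefer having the rule in the language.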
We can perhaps do a little better, although it will rely upon other language features. Most operators have the property that when they are used in an expression, they have no side effects on the operands. This is important for optimization when dealing with the basic types of the language. If we could specify on a method of a class that it has no side effects, then we could do the same thing with optimization of expressions using overloaded operators. And why not? This allows code heavily using overloaded operators to be written in the most readable way instead of the programmer having to manually eliminate common subexpressions, etc., to get the best performance (and this is very important for many common operator overloading usage scenarios).
The issue of whether to allow new operators to be defined is a tricky one. There are plenty of usage scenarios, particularly in mathematical and scientific programming, where the existing operators are deficient. Consider the case of a vector (in the mathematical/physical sense, not an array list). How do you define operators for dot products and cross products?
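To illustrate the problem, here is how the vector case typically ends up looking in C++, where new operators cannot be defined. This is a sketch with a hypothetical Vec3 type: addition gets a natural operator, but the two products have to fall back on named functions.

```cpp
// Hypothetical 3-component vector in the mathematical sense.
struct Vec3 {
    double x, y, z;
};

// Addition maps cleanly onto an existing operator.
inline Vec3 operator+(Vec3 a, Vec3 b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

// No existing operator means "dot product" or "cross product" without
// stretching its usual meaning, so these stay ordinary functions.
inline double dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}
```

An expression such as dot(a, cross(b, c)) works, but it reads nothing like the formula it came from, and overloading * or ^ for one of the products would be exactly the kind of semantic stretch I complained about with iostreams.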
I know of a couple of languages where nearly arbitrary new operators may be defined, J and Scala. J, although it is a fascinating language, is probably not a good model for operator overloading in a general purpose language, as it pulls off its tricks by being an interpreted language. And with Scala, I get the feeling that the difficulty of writing compilers for a language using their model of operator overloading was pushed to the side to prove the point that it can be done.
Here is a better solution in my opinion: Unicode offers a large number of mathematical symbols that could be used as operators (and unambiguously so as well, since they reside in well defined blocks in the code tables). At least two of these are in my opinion a must for a language that can provide useful operator overloading to scientific programmers, × and · (to handle the common vector case brought up above). There is no reason, however, that more of these symbols could not be made available as operators, even though they would be undefined for basic types. To make this feasible to implement, associativity and precedence would probably need to be left undefined (except for × and ·, which should be the same as *), so that programmers would need to clarify their expressions with parentheses. This is not necessarily a bad thing for readability anyway, and may be a good idea for some of the language-defined operators as well. Some operators are often misused in languages like C++ because their precedence is misunderstood, and some compilers for these languages offer warnings to check for this.
One final note: not every programmer will have a fancy IDE available, and given that × and · are not on standard keyboards, it may be a pain to have to type in these symbols. I prefer to solve this problem with a pseudo-TeX syntax (which would be familiar to most mathematical/scientific programmers anyway). × and · would be aliased with \cross and \dot, respectively, which shouldn't be a problem for parsing since I don't plan on using \ for anything else anyway (outside of string literals, that is). Similar aliases could be defined for other operator symbols not available on keyboards if necessary, and although the two examples I have given don't follow the TeX notation exactly (because I want them to convey their meanings as they would be used in programs), any others probably should follow TeX convention.
|
-
I think it is probably a good time to give a couple of motivating examples for my idea of a rich intermediate language. Both examples ultimately have to do with optimization, and are therefore more matters of performance than correctness. However, I believe that my solution also provides a run-time model that is intuitively what we expect from certain types of program behavior.
Often, applications have behavior that is customizable by setting certain properties in a properties file (such as those ubiquitous .properties files that Java applications often use, or XML-based configuration files), the Windows registry, environment variables, command-line args, etc. These behavior customizations are semantically related to conditional compilation, except the customizations occur at run-time instead of compile-time. Or so it seems. While the case could be made that the customizations technically occur at run-time, in fact they often conceptually occur at load-time. So for instance, when a Java class is loaded and its static initializer runs, a .properties file is read and the relevant properties in this file are stored as static variables of the class (or perhaps rather than attempting to pull out all relevant properties in the static initializer, the Properties object itself is stored). From this point on in many applications, these properties never change. Therefore, although the loading of properties occurs at run-time due to the limitations of the platform, it happens once-per-load and it happens before any other code in the class executes. This is my justification for calling the process conceptually a load-time process.
Let's step back for a second and look at conditional compilation. Why would conditional compilation be used at all, given that a similar but more flexible solution is available in setting properties in configuration files and the like? One reason is that a company may produce two or more versions of a similar product with different functionality, but that share a large portion of their code bases. It may be undesirable to make it easy for a customer to get functionality they didn't pay for by tweaking the configuration files. This particular concern is orthogonal to the conclusion I am trying to get at, but I want to make sure I don't ignore it. Another reason is that dynamic checks such as:
if (featureXEnabled) { ... }
can be a performance drain because
- The check itself takes time, albeit a small amount of time, that can nonetheless hurt performance significantly if it appears in a critical code path.
- The code for disabled features, while essentially dead code, still exists in the binary and therefore can increase the need for page swapping.
- Some potential for optimization is lost. In unmanaged code, the potential is irrevocably lost. In managed code, due to limitations of the optimizations that are done at load-time, because it is often difficult for the virtual machine to discover that these optimizations are possible, and because sometimes the code is written in such a way that it is impossible for the virtual machine to safely make the optimization, the potential is still usually lost.
Therefore, even if it would be beneficial from a business standpoint to make the behavior dynamically customizable, it is often necessary for performance reasons to use conditional compilation (or, in the case of languages such as Java that don't allow for conditional compilation, to develop in parallel two versions of the same application).
If there were a standard way for loading properties, one that the loader could be aware of, then it would be easy (easier at least) for the loader to make these kinds of optimizations as it loads the application/module. To make the discovery feasible, it would be useful to have this information specified explicitly in the intermediate code that is loaded, so that instead of the "if" statement above, the code in the statement would be tagged with "this block of code gets executed if and only if feature X is enabled". This requires some syntactic structure of the intermediate code.
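The closest approximation I can sketch in today's C++ is to compile the code once per feature setting and pick a version a single time at startup. The names process and select_process and the "true"/"false" property strings are assumptions for illustration.

```cpp
#include <string>

// The feature flag is a template parameter, so inside each instantiation the
// test is a compile-time constant and the disabled branch can be removed.
template <bool FeatureX>
int process(int value) {
    int result = value * 2;
    if (FeatureX) {
        result += 1;  // stands in for the feature-X-only work
    }
    return result;
}

using ProcessFn = int (*)(int);

// Hypothetical load-time step: read the property once, pick a version, and
// never test the flag again on the critical path.
inline ProcessFn select_process(const std::string& feature_x_property) {
    return feature_x_property == "true" ? &process<true> : &process<false>;
}
```

This removes the per-call check, but both versions still sit in the binary and every call now goes through a function pointer. A loader that understood a tagged "only if feature X is enabled" block could instead discard the unused version and inline the one that remains.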
My second example is dynamic class loading, used to select an implementation for a particular interface at run time. I will admit, this is an area where managed languages excel. My biggest complaint is that reflection, being based on calls to APIs using primarily strings to identify classes, provides little in the way of verification ability at either compile-time or load-time, but I will get back to this in a later post. My point is that, much as in the case of behavior being determined by properties in a configuration file that don't change once they are loaded, often the choice of implementation for a particular interface is decided once per load and never changed. In fact, Windows manifests (among their other uses) provide a rare example of such a mechanism that explicitly happens at load-time. If the programmer has a way of signalling to the loader that this is the case, further optimizations become possible, including inlining, eliminating the need for vtables, etc. Discovery of this too is made easier by having a structured intermediate language.
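Here is a small C++ sketch of the string-keyed selection pattern I am complaining about. Codec, FastCodec, SmallCodec, and make_codec are hypothetical names, and a real system would load the class dynamically rather than use a hard-coded table.

```cpp
#include <memory>
#include <string>

// Hypothetical interface with two interchangeable implementations.
struct Codec {
    virtual ~Codec() = default;
    virtual std::string name() const = 0;
};

struct FastCodec : Codec {
    std::string name() const override { return "fast"; }
};

struct SmallCodec : Codec {
    std::string name() const override { return "small"; }
};

// The implementation is identified by a string, as with reflection APIs.
// A misspelled name cannot be caught at compile time or load time; it only
// shows up as a null result when this code actually runs.
inline std::unique_ptr<Codec> make_codec(const std::string& impl) {
    if (impl == "fast")  return std::unique_ptr<Codec>(new FastCodec());
    if (impl == "small") return std::unique_ptr<Codec>(new SmallCodec());
    return nullptr;
}
```

In a typical program make_codec is called once at startup and the result is used for the life of the process, yet every call to name() still goes through the vtable because nothing tells the compiler or loader that the choice is final.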
|
-
I'm going to start with a top-down view of what this language should be. A good place to start would be the distinction between managed and unmanaged platforms and languages.
Most research on the matter has shown that managed code does a lot to reduce development time and makes it easier to write reliable and secure software. From personal experience, once I learned Java, I rarely used C++ anymore (well, in college and for just-for-fun projects at least; I've had to use C++ for work). Is managed code the way to go for new general-purpose languages? On this point, I am not certain. For one thing, I don't think it is really useful for systems programming. I know there is conflicting evidence on this question. On the one hand, IBM has shown that Java, with aggressively optimized JIT compiling, can outperform C++ on a wide range of tasks previously thought to be inappropriate for managed code. They even wrote a JVM in Java, and it was faster than Sun's JVM. On the other hand, Microsoft's practical experience with Longhorn has shown that perhaps managed code should be used sparingly in systems code.
There are a couple of different factors at play here. One is that Java is a simpler language, and therefore easier to optimize, than C#, the Microsoft managed language of choice. Therefore, many of the features of C# that are aimed more at performance than ease of use, which I might add probably do make well written C# code faster than well written Java code in most normal usage scenarios, work against C#'s performance in the aggressively optimized IBM research-style scenario. Perhaps this is a matter of compiler maturity, perhaps not. Another factor is that a complete operating system, and particularly a feature-laden one such as Windows, is a completely different domain than a virtual machine. Taken together, these factors make me leery of pushing a managed language as a systems language. I should note however that I see promise (perhaps many years down the road) for Microsoft Research's Singularity project, which would be an operating system written almost entirely in managed code. The difference with this project is that the verifiability of managed code allows isolation of "processes" without actually separating their address spaces, which could cut out the overhead of context switching entirely. But again, I don't see this effort coming to fruition for several years at least, and it is still uncertain whether the performance of such an operating system would be acceptable.
So managed code is probably good and beneficial for a vast majority of applications, but not for everything. What then is wrong with using a managed language (either one of the existing ones or a new one) for applications programming and an unmanaged language for systems programming? This idea works, but I don't think it is ideal, for a few reasons. First, there has been woefully little development of new systems programming languages. The only one I have seen recently that impresses me is Digital Mars's D programming language, which is a general-purpose, unmanaged language, with some aspects of managed languages (such as object references and garbage collection). Digital Mars should be applauded for their work, but I like to think that in the systems programming world, they are really only one step removed from C, despite the fact that their language incorporates features from a few generations of (non-systems programming) languages in between. How useful D is in systems programming will provide a great deal of information for the next generation of systems-programming languages (including, hopefully, the one I plan to develop).
Second, there is some "impedance mismatch" between almost any two languages, and some efficiency, both in programming effort and performance, is lost when applications and the operating system are written in languages that weren't a priori designed to work well together. Anyone who has written C++ code that interacts with Java via the JNI should know what I am talking about. Perhaps the performance aspect is an acceptable tradeoff, but in my opinion the extra programming effort is not. The designers of D, for example, made sure that it was easy to call C functions: no shims or thunks should be necessary.
Third, there will always be specialized applications for which the requirements on the expressiveness of the language far exceed what the average application needs. I think specifically of computationally intensive scientific and numerical programming. To include support for these in a purely managed language may put an undue strain on the average application (via larger instruction size, or the inability to optimize aggressively any normal application code that must interact with the specialized code). I could be wrong about this, but I don't think a purely managed language is the best place to try to shoehorn in these kinds of specialized extra features. And if the managed language doesn't support them, the programmers that need these features will continue (through no fault of their own) to use C, C++, and Fortran (yikes!). As I mentioned in a previous post, those programmers then lose the advantages of using more modern languages.
So what am I suggesting? An unmanaged language? Sort of. Here is the difference in my grand vision. I imagine an intermediate language playing a greater role. I don't mean an intermediate language of bytecodes corresponding to virtual instructions in a virtual machine. I mean an intermediate language that has syntactic structure and all of the expressiveness of the original language (minus the syntactic sugar), but that can be efficiently expressed in a binary format and more easily verified than the human-readable version. I see no reason to believe that it is only managed languages for which thorough verification is possible. After all, Ada (well, at least pre-Ada 95) is an unmanaged language, and a very expressive one at that, for which static analysis has been very successful in proving correctness of programs. In addition, new tools and technologies seem to already be making headway in verification of unmanaged code. I mention here only Microsoft's Prefast tool (not yet available as a customer tool, but perhaps soon) and Digital Mars's design by contract extensions to C, which have already been picked up by other language designers.
The execution model would leave whether or not to compile to machine code up to the loader. Note that due to the nature of the intermediate language I suggest, interpreting the intermediate language would necessarily be a more complicated endeavor than for bytecode languages such as Java bytecodes or the .NET CLR, so for most programs it would make sense to compile to machine code. For systems code, it would be absolutely necessary. I have other reasons to consider this execution model, but I will get into them in a later post.
An important consideration with this execution model is load time, but this issue can be overcome with a proper implementation of the loader. I imagine a loader that uses some type of structured storage such as a database to cache partially pre-JIT'ed versions of the code, so that optimization can be done incrementally where needed. In fact, it may be prudent to have the intermediate language be so structured. This would allow for inline assembly (if it turns out the language needs it; I hope to make it rare) to be more seamlessly included in the intermediate code.
|
-
This is a tricky question. There are probably plenty of features in special purpose languages that don't belong in a general purpose language. However, there are two reasons that we should consider these features:
- Some special purpose languages (such as SQL) are widely used from programs written in general purpose languages, and the fact that the general-purpose language doesn't understand the features of those languages makes it difficult to verify correctness at compile time.
- General-purpose languages tend to have more of the proven features that make development in general easier and the end products more reliable, secure, and performant. However, in certain specialized fields of programming, it is often easier to write your programs for a special purpose language, and you lose the advantage of those good, general-purpose features. If it is the case that old languages adopt changes slowly, it is even more the case that old special purpose languages adopt changes slowly. It is also true that new special purpose languages don't always adopt proven features from general-purpose languages, because of a feeling that they are not needed. If so-called "special purpose" features are added to a general-purpose language, it is more likely that those that need/want those features will use the general purpose language and get the corresponding benefits.
Of course, sometimes adding a special purpose feature will make things more difficult for everyone else, in which case it is probably not a good idea. These things all need to be considered carefully.
|
-
Briefly, I don't know that I can. But even if what comes out of this isn't a language that will be widely adopted, I still think this is a worthwhile exercise, because:
- Features I try out that turn out to be good may find their way into other existing languages.
- Features I try out that turn out to be bad will be a good warning to both those who want to add similar features to existing languages and those who want to include similar features in new languages.
Also, it's fun, and I will learn something regardless. Frankly, those reasons are enough for me.
|
-
This is one of those standard questions that everyone who has worked on designing a language has come up against. Instead of getting into specifics, since I can't say for certain at this point what will be better about this new language, let me make the following general points:
- Old languages are resistant to change. Therefore, even when a new language feature has thoroughly proven itself to be useful (for example, by inclusion in a newer language or a non-standard dialect of the same language), it will usually be years until the feature is added to the old language (if ever). There are several reasons for this: standards, wide deployment of compilers and tools for the existing language, the difficulty of making sure old code compiles under the new compiler.
- Pick any widely used language that has been around for a few years, Google it, and you will discover that there exist non-standard dialects, often very many of them. This is particularly true of the languages that have been around the longest (C, C++, Pascal, and Fortran are some of the biggest offenders; Lisp is perhaps as bad, although it is not as widely used). The fact that so many developers would go to the trouble of writing a new compiler to get language feature X not available in the standard language, although not proof that the language is irreconcilably limited, ought to be a clue that there are problems.
- If language B is developed after language A has been deployed, and if the designers of language B carefully consider the strengths and weaknesses of language A, then the designers of language B are inherently at an advantage over the designers of language A. All else being equal, language B will be better than language A. "All else being equal" is of course a tricky phrase meant to sidestep questions like "What if language B is essentially language A with some silly feature X added that turns out to be a total bust?" with responses like "All else being equal, the designers of language A would have added silly feature X." I don't want to push the point that new == good, because I don't believe it myself. But just as new != good, it is also true that old != good, and from my first two points, it should be clear that stable != good either. There will never be an end-all, universal programming language that everyone uses (despite what some hard-core C advocates might push) because improvements are always possible. With that in mind, why not try to make those incremental improvements?
- This is not a reason in itself, but I just want to preemptively counter the argument "You added feature X to your language; guess what, I can reproduce your feature X in my language with <complicated sequence of instructions>." This is all well and good if you don't have any choice but to continue using the old language, but if it is truly a worthwhile feature to have, I believe it is almost always better that the compiler understands it directly. A good example is the static (compile-time) assert. The fairly new D programming language has a static assert as a language feature (although perhaps it has been done before, this is the first place I saw it). You can get a static assert in C/C++ using a wacky macro that declares an array of size 1 if the assertion is true and size -1 if the assertion is false. However, when the assert turns out to be false, the compiler doesn't give you a very helpful error message. This trick for a static assert has been around for a while, but I am not aware of any major compilers that handle it nicely. In addition, this type of workaround for a missing language feature tends to take longer to compile than it would if the compiler were aware of it.
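To make the array-size trick from the last point concrete, here is a minimal sketch in C. The names STATIC_ASSERT, STATIC_ASSERT_JOIN, packet_header, and packet_header_size are all hypothetical, made up for illustration; real projects spell the macro in many different ways.

```c
#include <stddef.h>

/* Hypothetical helper macros: paste the line number into the typedef
   name so that several assertions in one file do not collide. */
#define STATIC_ASSERT_JOIN2(a, b) a##b
#define STATIC_ASSERT_JOIN(a, b) STATIC_ASSERT_JOIN2(a, b)

/* Declares an array type of size 1 when cond is true and of size -1
   when cond is false. A negative array size is not allowed, so a
   false condition stops the build. */
#define STATIC_ASSERT(cond) \
    typedef char STATIC_ASSERT_JOIN(static_assert_line_, __LINE__)[(cond) ? 1 : -1]

/* Hypothetical structure whose layout we want to pin down. */
typedef struct {
    unsigned char tag;
    unsigned char payload[3];
} packet_header;

STATIC_ASSERT(sizeof(packet_header) == 4);
STATIC_ASSERT(offsetof(packet_header, payload) == 1);

/* Uncommenting the next line should make compilation fail, typically
   with a complaint about a negative array size rather than about the
   condition that was actually violated. */
/* STATIC_ASSERT(sizeof(packet_header) == 8); */

/* Small runtime helper so the checked value can also be observed. */
size_t packet_header_size(void)
{
    return sizeof(packet_header);
}
```

The exact wording of the failure differs between compilers, but it generally talks about the array and not about the assertion, which is the readability problem described above.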
|
-
I have been interested in language design for a few years now. I guess my fascination started in college when I was exposed to a large number of diverse languages as part of my coursework. This only grew as I explored the many languages out there that, due to limited purpose, limited acceptance, or novelty, require some independent research to discover. I have been further inspired by my work at Microsoft, especially the difficulties in dealing with legacy code and the limitations of the languages we use, but also the tools, such as Prefast, that have been created to overcome these limitations. There are many languages out there with useful features unique to that language; when I encounter such a feature, I generally try to imagine how this feature could be integrated into a new general-purpose language. I also have a few new ideas that I haven't yet seen in a programming language. Trying out these new ideas seems to me a worthwhile pursuit.
The purpose of this blog is to present my ideas for discussion, and to ask other developers who are interested in such discussion for their own ideas. I hope that the results of this discussion will be a new general-purpose programming language (or at least the skeleton of one).
One might ask "Why is another general-purpose language needed?", or even "Is another programming language needed at all?". Let me defer these questions for a moment to concede that these types of questions bring up a good point. There are already a lot of languages out there, many of which purport to be general purpose. Several are standardized and already widely in use (an incomplete list, in no particular order: C, C++, C#, Java, Pascal, Fortran, Ada, Perl). Some have had trouble getting off the ground, are used only by a small number of advocates, or are too new to fairly evaluate their usefulness (Ruby, Python, Haskell, Boo, Lisp, OCaml, D, etc.). Finally, many people out there who write software, whether they are developers in the traditional sense or researchers, system administrators, and others who write code only as a task incidental to their other duties, are quite happy to use special purpose languages (SQL, Matlab and the like, scripting and shell languages, computer algebra systems) when they feel these languages are the most appropriate. By the way, my characterization of these languages is somewhat arbitrary; I only want to make the point that objections to a new general purpose language might better be classified as follows: "What is wrong with the languages we already have?", "How can you do better than the many new and struggling languages out there?", and "Why do features in special purpose languages have a place in a general-purpose language at all?". I will address each of these questions shortly.
|
|
|
|