Welcome to MSDN Blogs Sign in | Join | Help


You can meet me here, too:

http://professor-fish.blogspot.com/2009/08/same-guy-as-grammarware-haskellware.html

Regards,

Ralf

 

I have been talking about API migration a lot lately.

I have even a draft paper in limbo ... so much in limbo that I don't even link to it here.

In fact, I admit that it is very difficult to get anything done scientifically and practically on this topic.

 

Thanks to some recent collaboration, there are some cool results and a new draft paper.

Feedback very much appreciated.

 

Regards,

Ralf

 

http://www.uni-koblenz.de/~laemmel/xomjdom/

Transcript of an API migration for two XML APIs

Status
  • Paper submitted for publication
  • Wrapper implementation uploaded
 
Authors
Thiago Tonelli Bartolomei and Krzysztof Czarnecki and Ralf Lämmel and Tijs van der Storm
 
Abstract
API migration refers to the problem of adapting an application such that the usage of a given API (the source API) is replaced by the usage of an alternative API (the target API) serving the same domain. One may attempt to automate API migration by wrapping or some form of (source- or byte-) code transformation. API migration is relatively well understood for the special case where source and target APIs are essentially different versions of the same API. API migration is much less understood for the more general case where source API and target API have been developed more or less independently of each other. The present paper exercises a simple instance of the general case and develops engineering techniques towards the mastery of API migration. That is, a wrapper-based migration between two prominent XML APIs for the Java platform is studied where the various differences between the chosen APIs are identified, classified and counted in a systematic manner.

I am about to finish a short article on SYB in Prolog.

Deadline seems to be in 30 hours from now.

If anyone has suggestions, I'd appreciate it.

Thanks,

Ralf

http://www.uni-koblenz.de/~laemmel/OdeToProlog/

 

Scrap Your Boilerplate---Prologically!

Status (30 June 2009)
Paper available in draft form.
Code distribution not yet available online.

 

Author(s)
Ralf Lämmel

 

Abstract
``Scrap Your Boilerplate'' (SYB) is an established style of generic functional programming. The present paper reconstructs SYB within the Prolog language with the help of the univ operator and higher-order logic programming techniques. We pay attention to the particularities of Prolog. For instance, we deal with traversal of non-ground terms. We also develop an alternative model of SYB-like traversal based on metaprogramming. This generative, type-driven model is also amenable to type-driven optimization.

 

Keywords
Scrap Your Boilerplate, Haskell, Prolog, Stratego.

 

Bibtex entry
@inproceedings{OdeToProlog,
author = "Ralf L{\"a}mmel",
title = "{Scrap Your Boilerplate---Prologically!}",
booktitle = "Proceedings of the 11th ACM SIGPLAN international conference on Principles and practice of declarative programming",
publisher = ACM,
year = 2009,
note = "6 pages"
}

 

Downloads

retweewter V1 (C) 2009 Ralf Laemmel

README

What is that retweewter?

retweewter is a Twitter application. It is a toy application. It's
absolutely useless, equally redundant and nearly dysfunctional, but it
is fun doing this sort of code and app, and the author will keep it
alive for some time (such as mins to days) until he lacks further
interest or faces the outburst of unsatisfied users.

retweewter runs (or could run) continuously to retweet certain search
results to a designated user. The relevant tweets are identified by
(hashed) search strings. With retweewter, some community can agree on
a specific tag to be used throughout their messages and thereby
trigger those tweets to be aggregated on an designated twitter
account.

If this readme is up-to-date, then it is true that the author of
retweewter actually runs one installation of the app that can serve a
number of communities as described above. If you like to be covered by
that server (without any warranty and liability on the author's site),
please send the "input" as described below.


What input is needed for setting up retweewter (retweeting) for you?

 - user: the Twitter user to whom retweets go as updates
 - password: the password of the user (needed by the app clearly)
 - tag: search string (the heading "#" is added if it's not there)

See the FAQ for ways of submitting that information.
(See contacting the author.)


How should I think of retweewter's behavior?

 - retweewter only checks every few minutes or so for new
   tweets. (Most of the time retweewter is sleeping.) It depends on
   the search function of Twitter to find tweets that involve the hash
   string. Because of that, there may be some additional delay.
   Use some proper software instead of retweewter if you can't wait.

 - retweewter uses the Twitter API in a manner that it only looks at a
   small window of tweets. The current version does not use any paging
   and database support. The way retweewter is configured it could
   miss tweets if there are many (say 10, perhaps even less) within
   the turn around cycle of retweewter. That's entirely your problem.

 - retweewter tries to identify retweets that are no longer
   appropriate because the underlying tweet (from which the RT was
   generated) is gone. This feature is pretty experimental. It is
   entirely possible that the windowing approach of retweewter causes
   false positives and negatives. In fact, this whole deletion
   business made the experiment somehwat interesting.


More Frequently Asked Questions

 - Is retweewter free software or what? You are free to try using
   it. You are not supposed to pay me anything or anyone for this app.
   I might eventually publish the sources, and if so, the software is
   even more free.

 - Is retweewter developed in Haskell or what? It should have been
   clearly! The only reason that it was done in Java is that I am
   using Twitter programming in my Java-based advanced programming
   class and those students are clever enough to tell Haskell by the
   file extensions of the programs.

 - How can I contact the author of retweewter and stay in touch with
   the irrelevant details of retweewter's future? You can use the tag
   #retweewter in order to be retweeted to @retweewter, or you reply
   to @retweewter, and you may as well follow @retweewter. You can as
   well bother the author directly -- this is @notquiteabba.

 - Can't you think of a more convenient way for registering with
   retweewter? Sure I can, but this application isn't serious at all.
   If I end up getting users, I need to work on scalability -- for
   which I simply don't have time. However, I might add a
   twitter-based subscription feature anyway. More likely, I will
   discontinue retweewter very soon.


Thanks,
Ralf Laemmel

So we are looking forward 8 "long" tutorials on different parts of the transformation/generation space of software engineering. For example, Bran Selic will stand in for model-based engineering and Jim Cordy will sketch his ultimate TXL cookbook. See the full program below. In addition, we will host a "research 2.0" event with Jean-Marie Favre and his companion Denis Avrilionis of One Tree Technologies as the moderators or speakers. Yet in addition, there are another 6 short tutorials. We are also in the process of putting together a participants' workshop. Combined with the social program, this will be a packed week as in previous years.

At this point, we have 80+ registered participants.

REGISTRATION DEADLINE:
June 7

The Call for Participation follows.

Regards,

Ralf

 

GTTSE 2009, 06-11 July, 2009, Braga, Portugal

3rd International Summer School on
Generative and Transformational Techniques in Software Engineering

http://gttse.wikidot.com/

REGISTRATION DEADLINE:
June 7


SCOPE AND FORMAT

The summer school brings together PhD students, lecturers, technology
presenters, as well as other researchers and practitioners who are
interested in the generation and the transformation of programs, data,
models, meta-models, and documentation. This concerns many areas of
software engineering: software reverse and re-engineering,
model-driven approaches, automated software engineering, generic
language technology, to name a few. These areas differ with regard to
the specific sorts of meta-models (or grammars, schemas, formats etc.)
that underlie the involved artifacts, and with regard to the specific
techniques that are employed for the generation and the transformation
of the artifacts.  The tutorials are given by renowned representatives
of complementary approaches and problem domains. Each tutorial
combines foundations, methods, examples, and tool support. The program
of the summer school also features invited technology presentations,
which present setups for generative and transformational
techniques. These presentations complement each other in terms of the
chosen application domains, case studies, and the underlying
concepts. The program of the school also features a participants
workshop. All summer school material will be collected in proceedings
that are handed out to the participants. Formal proceedings will be
compiled after the summer school, where all contributions are
subjected to additional reviewing.

The formal proceedings of the previous two instances of the summer
school (2005 and 2007) were published as volumes 4143 and 5235 in the
Lecture Notes in Computer Science series of Springer-Verlag.


TUTORIALS

   * Software Product Line Refactoring
     Paulo Borba, Univ. Federal de Pernambuco, Recife, Brazil

   * The TXL Source Transformation Cookbook
     James R. Cordy, Queen's University, Kingston, Canada

   * Chasing Diagrams in the Mapping Forests of Model Transformations
     Zinovy Diskin, Univ. of Waterloo and Univ. of Toronto, Canada

   * Generating Language Tools with JastAdd
     Görel Hedin, Lund Institute of Technology, Sweden.

   * Model Driven Language Engineering with Kermeta
     Jean-Marc Jézéquel, IRISA, Rennes, France

   * Rascal: Meta-programming Made Easy
     Paul Klint, CWI and Universiteit van Amsterdam, Netherlands

   * Sourcerer: Slicing and Dicing Large Amounts of Open Source Code
     Cristina Videira Lopes, University of California, Irvine, USA

   * The Theory and Practice of Modeling Language Design for
     Model-Based Engineering
     Bran Selic, Malina Software Corp., Canada


RESEARCH 2.0 EVENT

   * Research 2.0 and Software Engineering 2.0
     Jean-Marie Favre and Denis Avrilionis
     OneTree Technologies SA, Luxembourg and
     University of Grenoble, France


SHORT TUTORIALS

   * The 'Archæology' of Transformations
     Michel Wermelinger and Yijun Yu, The Open University. UK

   * Compilation for Embedded and Reconfigurable
     Computing Architectures
     João M. P. Cardoso, University of Porto, Portugal

   * Lightweight Domain-Specific Language Processing in Kiama
     Anthony Sloane, Macquarie University, Australia

   * Genesys: Service-Oriented Construction of Property
     Conform Code Generators
     Tiziana Margaria, University of Potsdam, Germany

   * The Need for Early Aspects
     Ana Moreira, Universidade Nova de Lisboa, Portugal

   * Model Transformation Chains for
     Model-Driven Performance Engineering
     Mathias Fritzsche, SAP Research


ORGANIZATION COMMITTEE

   * João M. Fernandes (Program Co-Chair), Univ. do Minho, Portugal
   * Ralf Lämmel (Program Co-Chair), Univ. Koblenz-Landau, Germany
   * João Saraiva, Universidade do Minho, Portugal
   * Joost Visser, Software Improvement Group, The Netherlands


ADDITIONAL INFORMATION

For additional information on the registration, program, venue, and
other details of the summer school, please consult the web page:

   http://gttse.wikidot.com/

There is also a contact email address:

   gttse09 AT list.uni MINUS koblenz.de.

I enjoyed delivering an invited lecture at the PhD autumn school IPA 2008.

Abstract

For about 4 years now I have been sporadically talking about API migration as a scientific and technical challenge as well as a potential business idea. It all started with some background in grammar engineering, formal semantics, and re-engineering which made me see that APIs are tricky software languages that account for a good part of the intricacies in automated software evolution. I was fired up when I acquired a handle on Microsoft's XML APIs during my recent work on the XML team at Microsoft. At this point I gained confidence in the business value and the eventual feasibility of API migration. While API migration is arguably the top subject on my research agenda, I haven't made substantial progress over the last few years. Meanwhile others have made both expected and unexpected contributions to the subject, which I would like to put into perspective. However, I contend that the hard problem is still to be attacked. In this talk, I am going to reexamine the landscape of API migration, and submit a mid-term call-to-arms to cross the API migration's Rubicon. I will also report on some experiments or experiences with API usage analysis, API protocol specification, API-related flow analysis of programs, and transformation technology for API migration.

 

My slides are also online here. I am happy enough to work with Tijs van der Storm (CWI, Amsterdam) on this very topic. Tijs is currently visiting researcher at our department in Koblenz. We expect to release a paper on the "API migration's Rubicon" soon. Please stay tuned. The slides also show a few diagrams for Java API usage. We do such usage analysis to better understand the realities of APIs. (Thanks are due here to two students of mine: Ruwen Hahn and Jürgen Starek.) Again, we hope to publish some results of this project very soon.

 

Regards,

Ralf

 

So here is the long promised paper on the Java case study for grammar convergence.

Getting all these huge Java grammars to converge was not easy. We are still learning how to do this properly. We are still working on our transformation language, but we feel the results are good enough to go public now. The magic thing about grammar convergence is that it pushes you to necessarily find all the details of grammar variation. Arguably, grammar convergence should be more automatic. However, by asking you to make the transformation decisions, it helps you to understand the result.

From now on, suites of related grammars should have no excuse to be flawed and close-lipped as far as their relationships go.

 

Regards,

Ralf

 

PS: Now that I repaired the link I can as well put the abstract here.

 

Java is Not Syntax-safe -- Apparently

Authors
Ralf Lämmel and Vadim Zaytsev

 

Abstract
Using a mechanized process, we have reverse-engineered the relationships between the 6 different grammars that are contained in 3 versions of the Java Language Specification (JLS). To this end, we have extracted the grammars from (the HTML representation of) the JLS, and we have systematically resolved or captured all accidental or intended differences by grammar transformations. The project reveals a considerable number of problems with the grammars such as productions that did not define the intended language.

 

 

I have turned a bit too much into an org animal over the last year or so: several conference organizations, special issues in journals, way too many PC member ships (in fact, there are too many workshops that are like mini conferences). Add to this that the switch to a full(time) German Professor ruins all previous working habits. Now that I have seen the “dark side” (not speaking of any big corporation here), I even more admire those folks in academia who do loads of quality courses, and still do substantial research – perhaps even with getting their own hands dirty. Anyway, I guess, this is the reality shock that you can’t avoid when you switch from industrial research (or before that, academic research with a way lower teaching load) to the mode of a university teacher of the kind of a German Professor. Not sure why anyone would do this except for the “Anstellung auf Lebenszeit”. (I can imagine a few more reasons but I save you my sarcasm.)

My colleagues keep telling me that things get better, and arguably, I might have arrived at the turning point. Of great help is here that Vadim Zaytsev (a PhD student of X and me from Amsterdam; Vadim is getting closer to the finish line) has joined the software-languages team in Koblenz last spring. This is particularly exciting because he has profound background in grammarware engineering – a topic that I have been talking about a lot over the last few years without delivering any new stuff except perhaps for the various X/O/R papers. For the last 3 months, we have been working on some grammarware engineering activities. We have been hacking in the best sense of the word. Today, we are proud to reveal our work on “grammar convergence”.

I have just uploaded an intro paper on grammar convergence to my website, but expect some more stuff over the next few weeks and months. (This paper has just been submitted, and your feedback is welcome.) I quote from the conclusion of the paper: “If unit testing is the simple, pragmatic, and effective method to generally validate the I/O behavior of software modules, then grammar convergence is the simple, pragmatic, and effective method to keep scattered grammar knowledge in sync. In addition, the method can be used to capture the intended or the accidental differences between instances of scattered grammar knowledge. Further, grammar convergence also applies at the instance level (populated by XML trees, derivation trees, parse trees, etc.). That is, it can compare and converge data from different software artifacts.”

Final plug: Are you attending SLE 2008 – the First International Conference on Software Language Engineering? It’s end of September in Toulouse (which is in France, for geographically challenged readers). The scientific quality of the event taken for granted, let me mention that I have heard of phenomenal cuisine aspects.  Please consider attending. This conference on Software Language Engineering unites some communities that previously had mainly workshops as their forum. Imagine this: we had 100+ abstract submissions for this first iteration of the conference. Well, 90 sharp actually made the deadline in the end and submitted a proper paper. So we will have 18 papers over 2 days in the enjoyable surroundings of Toulouse and the MODELS conference with its various workshops – many of them also relevant for Software Language Engineers. Looking at the program, I see folks with background in refactoring, compiler construction, programming languages, declarative programming and executable specification, visual languages, modeling languages, formal methods, grammar-based programming, software transformation, software re- and reverse engineering, and model-driven development. Of course, I am biased, but this is a conference where I am sure that my broad interests are sufficient to appreciate each and every paper. The key is that all presentations assemble around the notion of software language engineering.

 

Regards,

Ralf

Berenike, Bernadette, and Isabelle are all doing great. I tend to put photos of my kids online when they get born, and then never again, which I admit is strange; see here and here. (There are only two links because B&B are twins.) I guess we have just so many photos of them that we don’t know which of them to put online. Also, Berenike and Bernadette are turning 9 next year, and they start to get embarrassed by their parents’ actions. Shooting and publishing photos may fail to see their approval. I missed the short window of doing so. Isabelle who turns 3 next year is a fast learner and great imitator; she is even less permissive than her big sisters.

Speaking of Isabelle and “approval”, I have uploaded a new paper (draft) to my website. It uses Isabelle/HOL to develop a sort of metatheory for traversal strategies. It is my first project on theorem proving. (I am sane enough not to count type-class-based programming in Haskell as theorem proving.) (Also, don’t get confused with the URL’s ending, “isabelle2”. There is really just one Isabelle/HOL paper so far. The URL helps me to avoid confusion with my youngest daughter’s photo directory, even though it is on a different host.) This is joint work with Markus Kaiser, a Mathematician, who has worked on theorem proving for some time (mostly in the context of cryptology) before he joined the SLE team in Koblenz.

This paper has just been submitted. Hence feedback, if any, is appreciated, and will be considered. The actual code distribution is still a bit of a mess, as of writing, but it will be fine, if the paper’s web site says so – soon, hopefully.

 

Regards,

Ralf

 

Title: A Formal Model of Traversal Strategies

Authors: Markus Kaiser and Ralf Lämmel

URL: http://www.uni-koblenz.de/~laemmel/isabelle2/

Abstract: There is an observable increase in interest in traversal expressiveness in programming: there are new rewriting languages, programming-language extensions, and combinator libraries that enable programmers to design and deploy traversals over structured data. Such traversal programs can go wrong in new ways. For instance, a traversal may suddenly fail or diverge. It is important to gain better understanding of properties of traversal programs so that programmers make more knowledgeable choices of traversal schemes, and programming environments may provide improved documentation, targeted checks, and other support. This paper presents a machine-checked, Isabelle/HOL-based, formal model of traversal strategies. Amongst others, the model provides sufficient conditions for traversal programs not to fail and not to diverge. The model with its mechanized proofs makes systematic use of Isabelle/HOL's various capabilities for dealing with recursion or induction.

 

 

Update: See the web site at http://gttse.wikidot.com/

______________________________________________

 

What: 3rd Int’l Summer School on Generative and Transformational Techniques.

Acronym: GTTSE 2009.

When: July 6 – 11, 2009.

Where: Braga, Portugal.

Speakers: to be revealed soon.

Previous editions: GTTSE 2005, GTTSE 2007.

Size: 100 = 80 participants, 16 speakers, 4 organizers.

What’s new this time?

·         Do you think anything was wrong? Say with …

o    ... the level of sophistication?

o    … the intensity of learning experience?

o    … the sheer number of nerds locked up together?

o    … the beautiful surroundings?

o    … the exercises in the swimming pool?

o    … the all-inclusive Portuguese dinners?

o    … the splendid social program?

o    … the low price of travel and attendance?

o    What else?

·         We will be using a painless selection procedure this time.

·         We want to definitely increase participation of women.

·         We are working on adding 1 or 2 satellite events.

·         There will be one sports event every day.

·         We aim at stronger industrial presence.

 

What could you do now?

·         Mark the date!

·         Spread the message; we want the best GTTSE talents to come.

·         If you have an idea for a short tutorial, contact us ASAP.

·         If you consider organizing a satellite event, contact us.

·         If you have ideas for industrial coverage, again, contact us.

·         We very much appreciate sponsors of course.

 

Your summer school team for 2009:

·         João M. Fernandes, Universidade do Minho, Braga, Portugal

·         Ralf Lämmel, University of Koblenz-Landau, Germany

·         João Saraiva, Universidade do Minho, Braga, Portugal

·         Joost Visser, Software Improvement Group, Amsterdam, NL

 

Short hall of fame (previous full tutorials only):

·         Don Batory, The University of Texas at Austin, USA

·         Ira Baxter, Semantic Designs Inc., USA

·         Jean Bezivin, INRIA, LINA, University of Nantes, France

·         Shigeru Chiba, Tokyo Institute of Technology, Japan

·         Krzysztof Czarnecki, University of Waterloo, Canada

·         Jean-Marie Favre, University of Grenoble, France

·         Jean-Luc Hainaut, University of Namur, Belgium

·         Zhenjiang Hu, University of Tokyo, Japan

·         Stan Jarzabek, National University of Singapore

·         Erik Meijer, Microsoft, Redmond, USA

·         Tom Mens, University of Mons-Hainaut, Belgium

·         Oege de Moor, Oxford University, UK

·         José Nuno Oliveira, University of Minho, Portugal

·         Markus Pueschel, Carnegie Mellon University, USA

·         Walid Taha, Rice University, USA

·         Eelco Visser, Delft University of Technology, NL

 

I have been somewhat silent for all kinds of boring reasons, but also quite so because I am terribly slow in grasping even basic category theory. (You don’t need to have any such knowledge to enjoy this post – just a bit of time because I guess this is going to be the longest blog post ever.) Still all the categorical pain was worth it, and here is why … The Expression Lemma captures a fundamental correspondence between Functional and OO Programming. In this blog post, let me try to explain the lemma in an easygoing manner. I will cut off all mentioning of the Expression Problem which constitutes arguably a motivation for this research. Let me focus on a very obvious and compelling motivation: the lemma provides a foundation for refactoring OO programs of a certain scheme to functional programs of a certain, associated scheme (and vice versa; in case, the latter is really found useful and not considered a criminal offense). BTW, the proof of the lemma will not be discussed below. Also, I should emphasize that only the simple expression lemma will be covered here. So let me refer to the paper for all such elaborations. Before I get started I want to acknowledge that I report about joined work with Ondrej Rypacek who is currently presenting our results at the Mathematics of Program Construction conference in Marseille. In fact, Ondrej deserves most if not all credit. Ondrej also blogged about the lemma some time back, and LTU had linked to the post.

 

Recursive operations on a recursive data structure

Consider a recursive data structure; let’s use the abstract syntax of a tiny expression language in the running example. For simplicity, let’s limit ourselves to two expression forms: numeric literals, and binary expressions for addition. Consider some recursive operations that are defined on the data structure. Let’s use “expression evaluation” and some sort of “expression transformation” in the running example. In Java, we can use an abstract class, Expr, to model the data structure with abstract methods, eval and modn, to model the operations. (We could also use interface polymorphism.)

 

public abstract class Expr {

    public abstract int eval(); // Evaluate expressions

    public abstract void modn(int n); // Transform modulo a constant

}

 

The two expression forms give rise to two subclasses that encapsulate data and operations.

 

public class Num extends Expr {

    private int value; // State

    public Num(int value) { this.value = value; } // Constructor

    public int eval() { return value; }   

    public void modn(int n) { this.value = this.value % n; }

}

public class Add extends Expr {

    private Expr left, right; // State

    public Add(Expr left, Expr right) { // Constructor

        this.left = left;

        this.right = right;

    }

    public int eval() { return left.eval() + right.eval(); }

    public void modn(int n) { left.modn(n); right.modn(n); }

}

 

Contrasting styles of decomposition

The OO style of decomposition can be contrasted with the functional style (or the style of constructive, algebraic specification, if you prefer). The data variants are defined independently of any operation; operations are defined separately as functions that case-discriminate on the data variants. In Haskell:

 

-- Expression forms

data Expr = Num Int | Add Expr Expr

 

-- Evaluate expressions

eval :: Expr -> Int

eval (Num i) = i

eval (Add l r) = eval l + eval r

 

-- Transform all literals modulo a constant

modn :: Expr -> Int -> Expr

modn (Num i) n = Num (i `mod` n)

modn (Add l r) n = Add (modn l n) (modn r n)

 

Semantic equivalence of OO and functional models

Now the 1 Million $ question is this: Are the two shown programs semantically equivalent, and, if so, how could one possibly prove the equivalence? The expression lemma will arise as an important tool in establishing that semantic equivalence. It is desirable to know of such equivalence because, for example, it would formally justify a refactoring from the OO style of decomposition to the functional one, and vice versa. Some might say that the semantic equivalence is reasonably obvious, but then again, how to formally establish this fact? Also, we are presumably looking for a general class of equivalent program couples (rather than the specific exemplars at hand), but at this stage, it is not obvious how this class would be characterized.

 

After 2-3 semesters of computer science, you may launch the following attempt.

Suppose:

·         j refers to the above Java program

·         h refers to the above Haskell program

·         [| . |]Java assigns semantic meanings to (the relevant subset of) Java

·          [| . |]Haskell assigns semantic meanings to (the relevant subset of) Haskell

 

Then our educated but inexperienced CS student could hope to be able to show that:

 

 [| j |]Java  [|h |]Haskell.

 

Or perhaps I just skipped too many classes and had too much wine over the years, and this could work. Anyway, I reckon that this approach would require “semantic voodoo” because the established semantic domains for the two abstraction mechanisms (i.e., classes vs. recursive functions on algebraic data types) are so radically different. In fact, the above equivalence claim does not make much sense, in its current form, because the I/O behaviors do not even match at the type level.

 

Matching up the input type

The functions of h take term-like data via an argument, while the objects of j readily encapsulate data (state) on which to invoke methods. We can match up the input type by building the objects of the OO program from the same representation that is fed into the functions. That is, expression objects would be constructed from expression terms. Let’s call this step recursive object construction, and designate a static method fromData in the class Expr. Now, if we assume that Java’s and Haskell’s integers can be compared, a “well-typed” proposition about the semantic equivalence of the eval method vs. the eval function can be stated as follows:

 

For all expression terms t:

Expr.fromData(t).eval()

   = eval t

(Java goes left, Haskell goes right.)

 

We cannot cover modn in this manner, as will be clear shortly.

 

Matching up the output type by serialization

That is, the function modn returns public data – expression terms, while the method modn returns opaque data – expression objects. So how can we possibly compare the results returned by the function and the method? An apparently symmetric choice would be to introduce the presumable “inverse” of fromData; say an instance method toData which, in the case of Expr, extracts expression terms from expression objects. In fact, some of you may notice that such a couple fromData/toData would be possibly related to what’s called (de-) serialization in distributed programming or object persistency. So let’s try to match up the output type of methods and functions as follows: once a method has been performed such that a new object is returned (or an existing object is mutated), we would extract the state by serialization, hoping to use the data types of the functional program again for the representation of the externalized state, so that comparisons between functional and OO world make sense. If this all works, a “well-typed” proposition about the semantic equivalence of the modn method vs. the modn function could be stated as follows. (For simplicity of chaining, we assume that modn returns the result of the transformation; hence, it is no longer a void method – something that is, btw, for deep reasons, even more evil for a Smalltalk programmer than a Haskell programmer.)

 

For all expression terms t, for all n:

Expr.fromData(t).modn(n).toData()

   = modn n t

(Again, Java goes left, Haskell goes right.)

 

Let’s scrutinize this use of de-/serialization. Here we note that serialization is “normally” meant to externalize an object’s state for the sake of being able to rematerialize (de-serialize) the object later or elsewhere. Not every object is serializable (in wild life). When an object ends up being serializable, then the serialization format is still effectively private to the corresponding object type, perhaps even to a specific implementation of the type – ask your local OO expert. Do we really want to define semantic equivalence with reference to serialization (or reflection and RTTI), i.e., systematically look into the objects at hand, and patently break encapsulation?

 

Matching up the output type by observation

I contend that semantic equivalence should instead be defined in reference to object interfaces and behavior (rather than “core dumps” extracted from the objects). So let’s try to limit the statement of equivalence to “observations” that are readily admitted by the OO interface. When these observations return “pure data”, e.g., integers, we can easily compare the results from the functional and OO world. For instance, the semantic equivalence of the modn function vs. the modn method could be approximated by using eval to perform observations. Thus:

 

For all integers n, for all expression terms t:

Expr.fromData(t).modn(n).eval()

   = (eval . modn n) t

(Again, Java goes left, Haskell goes right.)

 

This option is in peace with encapsulation, and therefore is to be favored. However, the generality of the option is still to be established. In particular, we need to find a way to universally quantify over all possible observations, and to compare the observations with the results from the functional world. To be continued.

 

Functional object encoding

At this point, we are still struggling with two different host languages in our equations (c.f.: “Java goes left, Haskell goes right”). A formal model is in closer reach if we first eliminated one language. It’s hard to be fair every day. Let’s say goodbye to the OO language; let’s encode OO programs functionally (aka non-dysfunctionally), i.e., by using a functional object encoding. Not much is changed though: the encoding preserves the decomposition style of the OO program, and the use of objects including encapsulation. We use Haskell to encode the encoding. There are the following steps:

 

1.       Declare an interface functor: this is a type constructor that defines the product of all method signatures for the interface of interest in terms of the state type of an object. The state type is abstracted as a type parameter; see x below. In the running example, the eval method returns an Int; the modn methods takes an Int and returns a transformed copy of “self” (aka “this”). Thus:

 

type IExprF x = (Int, Int -> x)

 

2.       Declare the type of opaque objects: we take the recursive closure of the interface functor. In this manner, the state type is effectively hidden from the observable interface. For convenience’s sake, we also define projections that allow us to invoke the different methods on a capsule. Thus:

 

newtype IExpr = InIExpr { outIExpr :: IExprF IExpr }

callEval = fst . outIExpr

callModn = snd . outIExpr

 

3.       Declare the type of interface implementations: This is just a trivial application of the interface functor. The idea is that an interface implementation observes the state and populates the type for the method signatures as prescribed by the interface functor. In the co-algebraic discipline of functional object encoding, the resulting type is called the co-algebra type for the functor at hand. Thus:

 

type IExprCoAlg x = x -> IExprF x

 

4.       Provide interface implementations: Remember, the Java blueprint contains one concrete class for numeric literals, and another concrete class for binary expressions for addition. In the former case, the state type is Int; in the latter case, the state type is a pair of IExpr objects. These are the corresponding interface implementations:

 

numCoAlg :: IExprCoAlg Int

numCoAlg :: IExprCoAlg Int

numCoAlg = numEval /\ numModn

 where

  numEval = id

  numModn = mod

 

… or in non-pointfree style:

 

numCoAlg :: IExprCoAlg Int

numCoAlg i = (numEval, numModn)

 where

  numEval = i

  numModn n = i `mod` n

 

 

addCoAlg :: IExprCoAlg (IExpr, IExpr)

addCoAlg = addEval /\ addModn

 where

  addEval = uncurry (+) . (callEval <*> callEval)

  addModn = uncurry (/\) . (callModn <*> callModn)

 

… or in non-pointfree style:

 

addCoAlg (l,r) =  (addEval, addModn)

 where

  addEval = callEval l + callEval r

  addModn n = (callModn l n, callModn r n)

 

5.       Provide object constructors: The Java blueprint had constructors. More generally, any class needs a protocol for populating object state. In a functional object encoding, a constructor can be modeled as a function that takes the initial state of an object, and encapsulates it with the method implementations. To this end, the fundamental operation of unfolding comes to rescue; unfold “builds” (or “constructs”) values of a functor’s recursive closure (here: IExpr) from values of some other type (here: integers or subobjects):

 

newNum :: Int -> IExpr

newNum = unfoldIExpr numCoAlg

 

newAdd :: (IExpr, IExpr) -> IExpr

newAdd = unfoldIExpr addCoAlg

 

Conceptually, the unfold operation is defined generically for any functor: (i) apply the argument – a co-algebra (c.f. c below), (ii) map the unfold operation recursively over the positions of the functor’s type parameter (here: the occurrence of self’s type in the arguments and results of methods, c.f. fmapIExprF below), (iii) tie the recursive knot of the functor (c.f. InIExpr below). Thus:

 

unfoldIExpr :: IExprCoAlg x -> x -> IExpr

unfoldIExpr c = InIExpr . fmapIExprF (unfoldIExpr c) . c

 

The definition of the functorial map is induced by the structure of the functor at hand. The IExprF functor comprises one using occurrence of the type parameter in the second component of the pair, specifically in the result-type position of the function type (i.e., in the return-type of the signature of the modn method). The argument of the functorial map is applied to that occurrence:

 

fmapIExprF :: (x -> y) -> IExprF x -> IExprF y

fmapIExprF f (e,m) = (e, f . m)

 

This completes our transcription of the Java code to Haskell.

 

Reality check: computing 42 in different ways

As a simple illustration and clarification of the development so far, let us do some expression evaluation in different settings. In the OO setting, we construct objects by nested object construction on which we invoke the method for expression evaluation. Thus:

 

public class Demo {

    public static void main(String[] args) {

        Expr x = new Add(

                    new Num(39),

                    new Add(

                        new Num(1),

                        new Num(2)));

        System.out.println(x.eval());

    }

}

 

Likewise, in the setting of the original functional program, we construct a nested term to which we apply the function for expression evaluation. While visually similar, it is important to notice that term construction results in plain, public data whereas object construction results in objects that encapsulate private data and behavior.

 

main = do

          let x = Add (Num 39) (Add (Num 1) (Num 2))

          print $ eval x

 

Here is also a demo for the functionally encoded OO program:

 

main = do

          let x = newAdd (newNum 39, newAdd (newNum 1, newNum 2))

          print $ callEval x

 

All programs agree on the result of evaluation: 42.

 

Gory details of recursive object construction

We are now in the position to actually define fromData – as needed for matching up the input type of the two programs. Thanks to the functional object encoding it is now easier to see that it really makes sense to construct expression objects from expression trees. We exploit the fact that the recursive data structure of the functional program coincides with the recursive closure of the union of the OO state types. Here is the type of fromData:

 

fromData :: Expr -> IExpr

 

Our development relies on “Squiggol power” and the entire backlog of functorially parameterized morphisms. Hence, we must redefine the recursive data structure as the recursive closure of its non-recursive functorial shape. (Compare this with the non-recursive interface functor and the separate definition of its recursive closure.) Thus:

 

type ExprF x = Either Int (x,x)

newtype Expr = InExpr { outExpr :: ExprF Expr }

num = InExpr . Left -- serves as constructor

add = InExpr . Right -- serves as constructor

 

The type constructor ExprF captures the functorial shape of the expression forms. That is, it is a sum over Int (corresponding to the case of numeric literals) and (x,x) (corresponding to the case of binary addition) where x is ultimately to be instantiated to the recursive closure of the functor. That closure is taken by Expr. Indeed, the newly defined Expr is isomorphic to the algebraic data type Expr – as it was defined earlier. We can now combine the constructors for the two kinds of IExpr objects into a single function that dispatches on the functorial structure; here we apply ExprF to the opaque object type:

 

newEither :: ExprF IExpr -> IExpr

newEither = newNum \/ newAdd

 

… or more verbosely, by expansion of “\/”:

 

newEither :: ExprF IExpr -> IExpr

newEither (Left i) = newNum i

newEither (Right lr) = newAdd lr

 

It remains to apply newEither in a recursive manner.  The fundamental operation of folding comes to rescue. A fold traverses over a recursive data structure while it is parameterized by a fold algebra, i.e., operations to apply to intermediate results as well as leafs. The “one-layer” constructor newEither serves as the fold algebra in our case. In different words, recursive object construction is defined as a fold over expression terms while the constructor newEither dispatches on expression forms and invokes data variant-specific constructors at each level:

fromData :: Expr -> IExpr

fromData = foldExpr newEither

 

Conceptually, the fold operation is defined generically for any functor: (i) reveal one layer of functorial shape from the recursive closure (c.f. outExpr below); (ii) map the fold operation recursively over the positions of the functor’s type parameter (here: over the positions for subobjects, c.f. fmapExprF below), (iii) combine intermediate results (or process leafs) by applying the fold algebra (c.f. a below). Thus:

 

type ExprAlg x = ExprF x -> x

foldExpr :: ExprAlg x -> Expr -> x

foldExpr a = a . fmapExprF (foldExpr a) . outExpr

 

The functor implies the following functorial map:

 

fmapExprF :: (x -> y) -> ExprF x -> ExprF y

fmapExprF f (Left i) = Left i

fmapExprF f (Right (x,x')) = Right (f x, f x')

 

Arriving at behavioral equivalence

We are still in need of a general method for comparing the results returned by the functions vs. the methods. We already alluded to the use of observations for that purpose. There is one trick we did not yet mention. That is, we can we construct objects (an ADT) from the functional program such that the functional program’s “public” data is encapsulated with the functions to be applied to that data. As a result, we will have objects on both sides of the equation, and hence, can perform observations on both sides, which makes it easier to set up a comparison. The ADT is constructed as follows:

newBoth :: Expr -> IExpr

newBoth = unfoldIExpr both

both :: IExprCoAlg Expr

both = eval /\ modn

 

… or more verbosely, by expansion of “/\”:

 

both :: IExprCoAlg Expr

both e = (eval e, modn e)

 

It’s time for a bold claim:

 

fromData = newBoth

 

That is, the objects obtained by recursive construction are equivalent to the objects obtained by the ADT construction. Let’s clarify the kind of equivalence at hand. Here we note that the terms on both sides of the equation are functions, and the equality sign is meant here as extensional equality, i.e., these functions always return the same result when applied to the same argument. Now, we also need to understand that the functions return objects which are in turn tuples of functions (methods). Further, each such method can be invoked again to return new objects (i.e., yet other tuples of functions). That is, method invocation can be arranged in arbitrarily long chains. Extensional equality of fromData and newBoth implies that all these possible chains also lead to observationally equivalent results. To summarize, the objects constructed by fromData and newBoth  are indistinguishable by the observations (including chains of any length) admitted by the interface. This is really as much as we could hope for within the bounds of encapsulation. Hence, from here on, we replace the vague term of semantic equivalence by the more rigorous term of observational equivalence.

 

Making explicit more structure

We are not yet ready for a proof of the above claim. Intuitively, we need to get to a point where we can show that the functional program and the OO program are based on the same problem-specific ingredients (having to do with data variants for expressions and operations thereupon) which are only composed in different manners. At this point, the functional program is particularly monolithic: two functions written in the style of general recursion. Contrast this status with the functional object encoding of the OO program, where the co-algebra identifies all problem-specific ingredients in a structured manner. It is easy enough to achieve a similar degree of modularity for the eval and modn functions by defining them in terms of fold; it is “well-known” that these forms can actually be derived automatically:

 

eval :: Expr -> Int

eval = foldExpr evalAlg

 

evalAlg = evalNum \/ evalAdd

 where

  evalNum :: Int -> Int

  evalNum = id

  evalAdd :: (Int, Int) -> Int

  evalAdd = uncurry (+)

 

modn :: Expr -> Int -> Expr

modn = foldExpr modnAlg

 

modnAlg = modnNum \/ modnAdd

 where

  modnNum :: Int -> Int -> Expr

  modnNum = ((.) num . mod)

  modnAdd :: (Int -> Expr, Int -> Expr) -> Int -> Expr

  modnAdd = ((.) add . uncurry (/\))

 

Getting a sense for duality

The ingredients evalNum, evalAdd, modnNum and modnNum are really like the equations in the style of general recursion except that recursive components of the left-hand side are already replaced by the recursively computed results, and hence, the ingredients do not involve any recursive function applications of their own. (This is just a trivial and general consequence of committing to bananas.) The OO program is broken down into problem-specific ingredients quite similarly. Let’s compare. The functional program is structured as follows:

 

newBoth

  =   uI both

  =   uI (eval /\ modn)

  =   uI (fE (evalNum \/ evalAdd) /\ fE (modnNum \/ modnAdd))

 

Here, we abbreviate as follows:

  • fE = foldExpr
  • uI = unfoldIExpr

 

The OO program is structured as follows:

 

fromData

  =   fE newEither

  =   fE (newNum \/ newAdd)

  =   fE (uI (numEval /\ numModn) \/ uI (addEval /\ addModn))

 

In both cases, recursion is under the regime of the composition scheme. You don’t have to be a rocket scientist in category theory to smell the duality here. Subject to several conditions on the problem-specific ingredients, we may be able, eventually, to prove that both compositions are observationally equivalent.

 

Matching up problem-specific ingredients

Get a coffee first; this is going to be the hard part of the post.

Playing stupid, one could hope that:

 

      evalNum = numEval

      evalAdd = addEval

      modnNum = numModn

      modnAdd = addModn

 

Quite obviously, this is not the case. (Not even the types fit.) A more complex correspondence has to be established. Let us do some factoring. To this end, we observe again that the functional style of decomposition results in as many “disjunctions” as there are operations; each such disjunction is processed by a fold; these folds are then combined in a conjunction. Dually, the OO style of decomposition results in as many “conjunctions” as there are data variants; each such conjunction is processed by an unfold; these unfolds are then combined in a disjunction. Let us factor out the inner uses of fold and unfold so that the problem-specific ingredients are solely composed by “/\” and “\/”.  For any functor F, the following “well-known” laws allow us to replace conjunctions of folds and disjunctions of unfolds by a single (co-) recursion:

 

The tupling law:

                fF a /\ fF b = fF ((a . F fst) /\ (b . F snd))

 

The co-tupling law:

                uF a \/ uF b = uF ((F Left .  a) \/ (F Right .  b))

 

(For the sake of a dense notation, we use a functor in an expression position to refer to the functorial map for that functor.) The tupling law happens to be the foundation of a powerful, optimizing transformation: if we wish to pair the results of two folds applied to the same argument (based on a split, c.f., “/\”), we can combine their fold algebras in one, and thereby do the work of both folds in one pass of recursion. That is, rather than pairing the final results of two folds, we perform pairing (and projection) at each level of recursion, and thereby suffice with one traversal. We do not care for optimization here, only for factoring. The co-tupling law is the dual of the first one. In OO terms, it tells us how to use a single interface implementation as a surrogate for two implementations of the same interface while taking the union of the state types. Regardless of any intuitions, the laws allow us to factor the functional vs. OO style of decomposition as follows:

 

Functional:

 uI (fE fp)

  where

   fp  =      ((evalNum \/ evalAdd) . E fst)

/\     ((modnNum \/ modnAdd) . E snd)))

 

OO:

 fE (uI oop) 

  where

   oop =      (I Left . (numEval /\ numModn))

\/     (I Right . (addEval /\ addModn))

 

Let’s distribute projections and injections so that we get conjunctions of disjunctions, or vice versa. Thus:

 

   fp  =      (evalNum \/ (evalAdd . (fst <*> fst)))

/\     (modnNum \/ (modnAdd . (snd <*> snd)))

 

   oop =      (numEval /\ ((.) Left . numModn))

\/     (addEval /\ ((.) Right . addModn))

 

The roles of “\/” and “/\” are flipped in fp and oop. The “well-known” abide law can be used:

 

(f /\ g) \/ (h /\ i) = (f \/ h) /\ (g \/ i)

 

Let’s apply the abide law to oop:

 

   oop =      (numEval \/ addEval)

/\     (((.) Left . numModn) \/ ((.) Right . addModn))

 

Now let’s match up the operands of “\/” in fp and oop.

We highlight the differences in bold face.

 

 

       evalNum

   =   id

vs.

      numEval

   =   id

 

 

       evalAdd . (fst <*> fst)

   =   uncurry (+) . (fst <*> fst)

vs.

       addEval

   =   uncurry (+) . ((fst . outIExpr) <*> (fst . outIExpr))

 

 

       modnNum

   =   (.) (InExpr . Left) . mod

vs.

       (.) Left . numModn

   =   (.) Left . mod

      

 

       modnAdd . (snd <*> snd)

   =   (.) (InExpr . Right) . uncurry (/\) . (snd <*> snd)

vs.

       (.) Right . addModn

   =   (.) Right . uncurry (/\) . ((snd . outIExpr) <*> (snd . outIExpr))

 

 

The differences can be understood as follows. The functional program has an extra step of committing results to the recursive closure of the data functor; c.f. InExpr. The OO program has an extra step of retrieving arguments from the recursive closure of the interface functor; c.f. outIExpr. These are really just “leftovers” from the particular decomposition styles. A “common denominator” of fp and oop (referred to as lambda from here on) can be obtained by factoring out (left or right) the extra steps:

 

oop :: IExprCoAlg (ExprF IExpr)

oop = lambda . fmapExprF outIExpr

 

fp :: ExprAlg (IExprF Expr)

fp = fmapIExprF InExpr . lambda

 

lambda = ( a \/ b ) /\ ( c \/ d )

  Where

   a = id

   b = uncurry (+) . (fst <*> fst)

   c = (.) Left . mod

   d = (.) Right . uncurry (/\) . (snd <*> snd)

 

Here is the executive summary of the steps:

 

  • (1.a) Apply the tupling law to the separate folds of the functional program.
  • (1.b) Apply the co-tupling law to the separate unfolds of the OO program.
  • (2.a) Distribute projections into the disjunctions of the functional program.
  • (2.b) Distribute injections into the conjunctions of the OO program.
  • (3.) Apply the abide law to one of the intermediate results (say, to the OO program).
  • (4.) Factor the programs to appeal to the following shapes:

 

oop = lambda1 . fmapExprF outIExpr

fp = fmapIExprF InExpr . lambda2

 

  • (5.) Show (perhaps just “notice”) that lambda1 and lambda2 are equivalent.

 

Step (5.) may fail for one of two reasons:

  • We are not clever enough.
  • The lambdas are different.

 

The type of the common denominator

What would be the type of lambda? As a Haskell programmer, you get used to all kinds of advanced cheating – one is type inference. However, let’s fight like real men, and figure out the type ourselves because this might actually help with understanding the magic at hand. We start from the types of oop and fp:

 

  oop  :: IExprCoAlg (ExprF IExpr)

  fp   :: ExprAlg (IExprF Expr)

 

Let’s expand the type synonyms for (co-) algebras:

 

  oop  :: ExprF IExpr -> IExprF (ExprF IExpr)

  fp   :: ExprF (IExprF Expr) -> IExprF Expr

 

Let’s do step (4.) at the type level:

 

  lambda1 :: ExprF (IExprF IExpr) -> IExprF (ExprF IExpr)

  lambda2 :: ExprF (IExprF Expr) -> IExprF (ExprF Expr)

 

This does not leave much space for a type of the common denominator:

 

  lambda :: ExprF (IExprF x) -> IExprF (ExprF x)

 

The two disagreeing positions of lambda1 and lambda2’s types (highlighted by bold face above) were generalized to the same type variable x. The type of any actual lambda must be at least as polymorphic as shown. We note that lambda‘s type is strictly more polymorphic than the explicit type signatures that we calculated for lambda1 and lambda2. If the factored definitions lambda1 and lambda2 (obtained by steps (1.)-(4.)) fail to type-check against the more polymorphic type, then our method fails to find observational equivalence. In categorical terms, the required polymorphism is what lambda makes a natural transformation, which is essentially a mapping from one functor to another. In fact, lambda is a special kind of natural transformation, i.e., a distributive law. That is, functors of source and target are actually compositions of two functors, and the composition is carried out in flipped order when comparing source and target; c.f.  ExprF . IExprF vs. IExprF . ExprF.

 

Unnatural or non-dualizable programs

It’s pretty easy to end up outside the required polymorphism. There is actually a systematic way of discussing such causes of unnaturality. We simply look at the computed types of lambda1 and lambda2 and discuss all the differences one by one:

 

  1. (Compare the argument position of lambda1 and lambda.)

The OO program is not allowed to invoke method chains on the subobjects. The result of invoking a subobject (by a single method call) cannot be further observed. (This follows from the fact that the general type IExpr in oop (which is isomorphic to IExprF IExpr in lambda1) is restricted to IExprF x in lambda.) This limitation may come over as odd, and a generalized expression lemma is appreciated; see our lovely paper.

 

  1. (Compare the result position of lambda1 and lambda.)

The OO program is not allowed to arbitrarily construct and replace subobjects. A method implementation is bound (by the type) to use the objects returned by the observations of the subobjects to fill in the slots for the subobjects in the result. This is arguably not a bad limitation as it rules out OO programs that sort of arbitrarily rewrite the object graph. That is, the methods are enforced to be compositional in a certain way.

 

  1. (Compare the argument position of lambda2 and lambda.)

The functional program is not allowed to examine the precise expression structure of the intermediate results obtained by recursion. The less polymorphic type of lambda2 exposes the precise expression structure; the extra polymorphism of lambda hides the expression structure. Imposing such a ban on functional programmers resembles the notion of opaque data on the OO side, where one is not allowed to examine state (except when backdoors lack reflection and serialization are leveraged).

 

  1. (Compare the result position of lambda2 and lambda.)

The functional program is not allowed to construct deep terms while combining intermediate results obtained by recursion. The type of lambda only admits one layer of functorial shape in the result; it does not even allow leaving out that single layer of functorial shape. This limitation may come over as odd, and a generalized expression lemma is appreciated; again, see our lovely paper.

 

The Expression Lemma, finally

If we are able to match up the ingredients of a functional program and an OO program using the above steps (1.)-(5.), then the Expression Lemma promises observational equivalence for the different styles of composing the ingredients. The following version of the lemma glances over some technical details; please look at the paper, if you need to know more, and want to scrutinize the lemma or its proof. Given are two functors, T and I, think of T as the data functor (like ExprF in the example), think of I as the interface functor (like IExprF in the example). We assume generic definitions of fold and unfold for any functor. For clarity, we continue to attach the functor in question to any use of fold and unfold. Let’s stop using problem-specific recursive data types, and use the following generic type constructor for taking the recursive closure of any functor:

 

newtype Mu f = In { out :: f (Mu f) }

 

Then, for any lambda :: T (I x) -> I (T x), the following identity holds:

 

                unfoldI (foldT ((I In) . lambda) = foldT (unfoldI (lambda . (T out))

 

(Read as: “Functional and OO programming is not too much different!”)

 

Towards refactoring

We can also use the developed machinery to refactor an OO program into a functional program (or vice versa). For instance, in the direction of OO to functional programming, we first factor the OO program according to the steps (1.)—(4.), if possible. Then we set lambda = lambda1 and lambda2 = lambda. Now, we attempt the steps (1.)—(4.) in inverse order so that the functional program is calculated. Likewise, the other direction can be accommodated. The refactoring fails if we hit a “case without counterpart”:

 

  1. The occurrence of the data functor in the result position of lambda seems to suggest that the implementation of any given operation for any given data variant can freely choose the data variant in the result of the operation. This is true for folds (read: constructor x can be rewritten to constructor y), but imposes a slight challenge on OO programming. That is, an OO method with “self’s” type in the result position of the interface functor must return an object of the same state type. The problem shows up when we attempted the inverse application of the co-tupling law (as part of deriving an OO program from a functional one), which may just fail when lambda was too liberal. We may overhaul the interface functor to use the recursive closure in place of “self”, but the expression lemma will no longer apply. Alternatively, we could stop refactoring before the application of the co-tupling law, but the resulting OO program would now designate one “class” to all data variants, which may count as failed decomposition.

 

  1. The occurrence of the interface functor in the argument position of lambda seems to suggest that the implementation of any given operation for any given data variant can freely refer to the results of applying any operations recursively on components of the data. This is true for functional objects (read as: method x can call method y), but imposes a slight challenge on functional programming. That is, if we start with the assumption that all functions should be separate folds, then they cannot cross-reference each other. However, if we start from (or stop at) a tupled fold, the problem goes away. The problem shows up when we attempted the inverse application of the tupling law (as part of deriving a functional program from an OO one). If we assume that programmers can use the notation of general recursion, it is fair to further assume that the translation into folds directly returns a tupled fold. Hence, we do not face any proper limitation here. 

 

Wrap-up

I hope some of you actually managed to get to the end of this text. I also hope that the text helps with improving the understanding of the fundamental correspondence between functional and OO programming. Of course, there are many loose ends that need more work. For instance, how do we deal with hard-core non-functional objects (when mutation and aliasing cannot be ignored)? Also, what are the exact mechanics of a refactoring transformation? Further, how does the state of the art in fold mining relate to the needs of our method? Yet further, what other ideas for refactoring OO to functional programs must be critically integrated to make a contribution to real-world scenarios such as multi-core parallelization? Finally, I realize that despite this long text, the presentation is still somewhat dense for someone not trained in Squiggol power and Haskell magic; so I hope that Ondrej and me end up writing a comprehensive tutorial on the full subject some day.

 

Have a great summer!

Let me go back to the beach now that I have finished this quick post.

Ralf Lämmel

 

The department of Computer Science, University Koblenz-Landau, Campus Koblenz invites applications for 2 research positions, available initially for 2 years:

* 1 PostDoc
* 1 PhD student

The corresponding funding is part of the state of Rhineland Palatinate's Research Initiative 2008-2011. The successful applicants will work on the research theme of "ADAPT: Modeling and Analyzing Software Adaptation".

The objective of ADAPT is to relate, advance, combine, and challenge adaptation methods and associated methods of modeling and analyzing that are used by the communities of software engineering, programming languages, logic-based modeling, multi-agent systems, formal methods, SOA, web systems, and mobile, autonomous systems. Please consult the ADAPT home page for further details: http://adapt.uni-koblenz.de/

9 research groups from the CS department in Koblenz (from several of its institutes) are associated with the theme. Also, the theme leverages collaboration with international partners at the CWI,Amsterdam, and Chalmers University of Technology, Göteborg. The successful applicants will research in the interdisciplinary context of ADAPT, and be actively involved in further building up and refining the research theme. Thus, the positions provide extra opportunities for qualified applicants to distinguish themselves, in addition to the research aspects and the possibility to work on a dissertation and habilitation thesis.

The deadline for applications is 1 September 2008. Email applications are preferred. See the contact section on the ADAPT home page.

Participants:
* Bernhard Beckert (Formal Methods and AI, Spokesperson)
* Jürgen Ebert (Software Engineering)
* Ulrich Furbach (Artificial Intelligence, Spokesperson)
* Rüdiger Grimm (IT Risk Management)
* Ralf Lämmel (Software Languages, Spokesperson)
* Dietrich Paulus (Active Vision)
* Steffen Staab (IS and Semantic Web)
* Klaus Troitzsch (Empirical Methods, Modeling and Simulation)
* Dieter Zöbel (Real-Time Systems and Mobile Systems Eng.)

 

http://planet-sl.org/sle2008/ 

1st International Conference on Software Language Engineering (SLE)

There are slightly extended deadlines:

**Revised submission dates: Abstract submission: July 16th; Paper submission: July 21st**

I also look fwd the following keynote speakers:

  • Anneke Kleppe, Capgemini, The Netherlands
  • Mark van den Brand, Technical University of Eindhoven

Obviously, I also look fwd being in Toulouse.

The next 2 weeks I am going to enjoy the Baltic Sea though. 

Regards,

Ralf

I am crazy about language processing!

It is an excellent way to think deeply about programming, programming languages, metaprogramming, and it is the foundation of automated software engineering. Over the last 15 years or so I have written countless language processing components (parsers, pretty printers, transformations, etc.). Many of them originated in a teaching context. Others originated in the context of developing or exercising language processing infrastructures – as part of my research work. Yet others were implied by my continued interest in formal (and executable) semantics as well as type systems and program analysis. Further, I am a fan of declarative languages and programming methods (think of Haskell, Prolog, Attribute Grammars, DCGs, etc.) – language processing is a home match for declarative programming, and I am playing it every now and then. Finally, I have also worked on software re- and reverse engineering projects (both in industry and academia), and guess what, language processing is a recurring theme there.

 

Language processing and teaching

These days, the software engineering and programming language communities discuss language processing-related issue a lot. Just think of all the intensive efforts focused at MDE, DSLs and software language engineering. The observable level of importance of language processing makes me wonder about some educational aspects:

How can one actively contribute to a matured body of conceptually founded and practically applicable knowledge about languages and language processing? How can one efficiently disperse such knowledge in university courses without though necessarily assuming entirely new courses?

The new SLPS project (read as Software Language Processing Samples) may hold one answer to these questions. SLPS is set up to collect samples that explore different methods of language processing, different languages under study, different implementation languages and platforms, and different auxiliary implementation technologies (e.g., parser generators). SLPS is a SourceForge project; see the project site and the project web site. The project will hopefully grow into something useful that inspires programmers as much as university teachers. SLPS could become a community effort that collects programs of “a certain kind” – just as much as the sites “99 Bottles of Beer”, “Hello World”, or “OO Shape Examples”. This is a call to arms for users and contributors. I am absolutely motivated and prepared to push this project for some time, but I hope that others see the value and join in. I readily leverage the value of the project in teaching, and consider the teaching value indeed to be the sweet spot of the project. (There may be an idea for a textbook in the air, subject to more thinking.)

 

The Factorial Language (FL)

To begin with, I have uploaded several implementations of the Factorial Language (FL) – a tiny functional programming language with first-order functions over integers. (Lukas Renggli has readily added an FL implementation based on Squeak.) The uploaded language processors parse, pretty-print, evaluate or optimize programs like the following:

mult n m = if (n==0) then 0 else (m + (mult (n - 1) m))

fac n = if (n==0) then 1 else (mult n (fac (n - 1)))

In case you didn’t guess it yet, the language’s name refers to its groundbreaking ability to express the factorial function. I have been told that the Fibonacci and the Ackermann functions can be expressed, too. This may be entirely possible. In the above snippet, we define both a mult and a fac function because there is no built-in multiplication; mult is derived from built-in addition.

 

ANTLR, AspectJ, C#, Haskell, Java, Prolog, Smalltalk, Tom, XML, XSD, …

Here are the existing FL implementations, as of writing: (i) a Haskell-based one using parser combinators for parsing, and generic functional programming (Data.Generics) for implementing an optimization; (ii) a Prolog-based one using DCG style for parsing; (iii) a Java-based one using ANTLR to parse concrete syntax, and plain old classes with virtual methods otherwise; (iv) another Java-based one using additionally TOM for implementing several operations by rewriting; (v) yet another Java-based one that bypasses concrete syntax, and uses XML and XSD instead, to which end it uses the JAXB approach for XML/object de-/serialization; the language processing operations are added as virtual methods by AspectJ’s intertype declaration; (vi) a Squeak-based one; (vii) two C#-based ones. I really look forward seeing additional FL implementations or entirely new groups of language processing samples.

We may be able to grow this sample collection up to a point … (i) that it captures and demonstrates most if not all principles and methods of language processing, language-oriented programming, compiler construction, X/O/R mapping, XML- and grammar-based programming, program and software transformation, and, last but not least, formal semantics; (ii) that it showcases, compares, and evaluates popular or interesting language processing technology – such as combinator libraries (for parsing, pretty printing, transformation), DLS infrastructures and frameworks, Metamodeling technologies (e.g., based on EMF), and IDEs; and (iii) that it collects executable knowledge about languages – both the languages under study and the languages used for implementation.

 

Regards,

Ralf

The call for papers for the 1st International Conference on Software Language Engineering (SLE) is public now. I am happy that we did all the work to start this new conference. I am 100% confident in the SLE conference; it’s backed up by a strong and self-confident community that has something to say about generic language technology, programming languages (incl. parsing, semantics, type checking), domain-specific languages, program transformation, software re- and reverse engineering, and model-driven development; it’s also backed up by the ample relevance of this area for the progress of IT, and the insight that the fundamental notion of languages cannot sufficiently be discussed within the scope of “technical spaces” alone (read as grammars, XML, UML, graph transformation, etc.). Have a look at the call. SLE 2008 will be co-located with MODELS 2008, September 29-30 October, in Toulouse, France.

Disclaimer: A subjective perspective on SLE follows. Opinions expressed below are neither necessarily shared by my wife, nor by other SLE organizers and combatants, nor by my current/previous/future employers. Also, don’t take me too serious. Remember this is a blog, I am a Cobol+Haskell nerd, and I don’t use smilies, as a matter of style. As one of my SLE friends (namely, JMF) tends to say: “May the fun be with you.” Indeed, SLE 2008 isn’t just an ambitious conference. We also want to have fun studying languages, being language nerds, comparing artificial and natural languages, engineering denotational semantics, hooking into popular IDEs, hacking parsers and type systems, and bashing XSD. Did I mention, BTW, that we meet up in Toulouse? Toulouse is not Bordeaux (“city of wine”), it’s not Lyon (“city of gastronomy”), but it means quality French wine and food; if you are stationed elsewhere in this world, then get some rest for a few days, and submit an unrejectable paper, or attend anyhow. In fact, I am just reading in a message from JMF: “Toulouse, a `haut lieu de la gastronomie francaise’". Watch your weight.

 

The IT Tower of Babel


Just as much as mankind has suffered from the construction of the (Biblical) Tower of Babel and the subsequent confusion of languages (in fact, tongues), mankind (say, IT) is now suffering from the IT Tower of Babel – say the continuing emergence of always new (artificial) languages without much reliable ability to mediate between or to integrate or to evolve or to replace these languages; without even much shared understanding for the need for all these languages; overall: without much ability to deal with the increasing complexity of the pool of relevant languages in any given IT context.

Is the Tower of Babel a good metaphor? Admittedly, the (Biblical) Tower of Babel was simply an overambitious construction project that justly annoyed the chief architect (not of the building but of the universe) up to the point to punish the constructors with the confusion of languages; other punishments could have been chosen – such as blocking access to further construction material. In contrast, I use the term “IT Tower of Babel” to refer to the plethora of artificial languages as such. This plethora was not caused by Microsoft’s CSA or any other major chief architect, but it’s pretty much a self-punishment of the IT community as a whole. We somehow ended up with all these artificial languages. Technology providers (say for compilers and platforms) certainly had a part in this: variation of language (read as programming language, any sort of tool-dependent file format) is often used for differentiation. Generally, applied computer science (let’s say “IT computer science”) is not quite an engineering science; it is so much affected by individuals, industry, fashion, and fiction. Finally, the emergence of artificial languages is subject to entropy; it seems we are just getting more – no matter what.

Life was easy when there was only the Adamic language; IT life was easy when languages were in the hands of computer scientists (as opposed to language amateurs, not caring about semantics and program algebra); when there were just a few important programming languages: Cobol, C, Fortran, and not much more; when systems didn’t have to interoperate by means of C/S, SOA, WebServices; when there were just a few APIs per platform; when system owners were mostly happy with staying with their (evolving) platform for 1-3 decades. In fact, there were quite a few platforms (think of the mainframe age and the various Cobols; ask your grandpa), but that wasn’t too much of a fundamental problem; it merely provided job security for a small crowd of “interpreters” – specialists who could connect systems on different platforms in those relatively few cases when it was inevitable. (Read “few” in comparison to today’s standards.)

The confusion of natural languages happened instantly, and it was fatal to the further construction of the (Biblical) Tower of Babel; suddenly we had 70-72 dialects – a number that is amazingly small for IT standards. In contrast, the confusion of artificial languages went through some stages, and it’s ongoing.

 

Stage I: “Many programming languages”

The problem with the confusion of natural languages basically was that people lacked translational power for the 70-72 dialects. Lacking this power, the offending building had to be abandoned. There are probably a thousand programming languages in use, certainly if we count dialects and versions, even if we narrow down the number to business-critical languages – say languages used for business-critical software systems – as opposed to say languages for making rhymes about beer. Such a multitude of languages is a somewhat unsolved and serious problem in IT by itself (and not made easier by the continuing increase of languages); think of mass maintenance or language conversion. The problem can be economically fatal for a system owner; lack of transformational and conversional power makes it impossible to keep up with inevitable evolution pressure.

Now, you might say: “Lack of transformational and conversional power as in mass maintenance? This sounds like a Cobol problem to me. Cobol is for wimps; I am using Java or .NET, look at my beautiful refactoring support, lock at how I can patch this/that compiler to give me (low-level) access to the parse-tree of a program.”

Then try this: develop a converter from Java 1.1 to C# 3.0 applications; please send me an email, every now and then, how it goes. If you are among the top 10 of computer scientists worldwide, or an extraterrestrial hacker, you might get this done before Isabelle enters school, and you will sell it for more money than is needed to stop global warming locally in your county. That’s the point: shouldn’t we all work on making these things easy and repeatable and doable by folks with just one PhD? Let’s have 10 years of the new SLE conference, and see whether we can get closer to this goal!

Want a different challenge? Then try this one: go and patch Squeak or any other major Smalltalk dialect so that it gets a decent amount of static checking, in fact, up to the degree that you would get in an OO functional language with type inference, modulo the necessary compromises to account for Smalltalk’s breathtaking expressiveness, most likely on the grounds of soft typing. Providing static typing for Smalltalk has been attempted a number of times (see Strongtalk, for example), but it hasn’t been successful so far. Smalltalkers would absolutely appreciate additional checks, as long as they don’t get into the way, as long as they do not remove expressiveness. I think that this is an extremely laudable attitude, and we just must get better software-language engineers to solve this problem, ultimately. In more general terms, Erik Meijer uses the slogan “Static Typing Where Possible, Dynamic Typing When Needed”, and, of course, the assumption is that we extend our horizon as to what we call a type, as to the perhaps old-fashioned dichotomy of static types, dynamically checked assertions, and offline-proven program properties.

 

Stage II: “Complex platforms”

 

Even if you are a wizard of the C#, Java or VB syntax and semantics, you cannot write much else than “hello world” programs and the factorial function. In fact, “hello world” already requires some “library” knowledge.  More generally, nobody can get real work done without profound knowledge of a platform. I want to make the point here that a platform is actually an overweight cocktail of languages: API signatures, API usage protocols, make/build/configuration descriptions, ontologies for semantic annotation (“custom attributes”), …, you name it. Further, the use of a typical platform is tied to the use of different data models in the average application: POJO/POCO, not so plain objects, a bit of relational model, and a bit of XML. (Defining schemes according to different data models, mapping between specific schemes, and mapping generically between different data models – these activities all count as software language engineering, in case, you didn’t notice.) Moreover, as a “platform user”, you use different type systems: Java 1.1, Java 1.5, C# ?.?, VB.NET ?.?, SQL, XSD, … – type systems are languages, too. Finally, you also use systems of designs patterns – again, these are languages.

Compare this to good old Cobol times, when you had an all-inclusive language (“hello world” works with the DISPLAY statement; data access works with the CRUD statements), and some pretty lean (and optional) tools for forms, reports, and structured programming. The data model of hierarchical indexed and sequential files was deeply embedded into Cobol. Can’t we just fix Cobol? It already has OO. We know how to do aspects in Cobol; even though the extension got accidentally obsoleted some decades ago. Let’s re-enable aspects, add a few more things, and then convert back to Cobol, worldwide. I’ll try to get the Haskell community aboard.

If we do not convert back to Cobol, we need to deeply understand platforms as cocktails of languages. You might say, “I don’t care much that these are languages; I just have to handle this complexity, call it language, call it software development reality. Again, I don’t care.” That’s the point: nobody cares, and the various languages are disintegrated; some of them are obfuscated; many of them are compromised. Also, they deadlock each other’s evolution and clean-up. Indeed, backwards compatibility deadlocks everything. I have a dream: I imagine an (IT) world where the notion of backwards compatibility has stopped to frighten us; evolution is just so effective that we can always factor in ways that we see a system and a platform that represents the optimum in design conceivable by the time. (Perhaps, the Smalltalk folks can do it? Come and talk at SLE 2008…!)

Again, we should be able to effectively refactor platforms. First example: concepts that were previously in the build rules get suddenly deeply embedded into the language, or disappear from the programming surface because new kinds of compilers automate the concept. Second example: How many XML APIs are there for Java? My bet is 42. Do we need all of them? Shouldn’t we be able to retire half of them or more? Should a Java program in 5 years from now even care about the dichotomy “in-memory” vs. “push/pull”? Third example: if suddenly XSD is found to be too complicated, can we just replace it all over our apps by the use of say RELAX NG? (Some XML bashers may even want to stretch this scenario, and transform away all XML in their apps. Now, let’s not overdo it, Ok?) Fourth example: whenever a major language extension makes to our favorite programming language, we would like to retire design patterns that mimicked the same concepts previously, and we would like to do mining so that lower-level program structures are semi-automatically lifted to the full, extended language. We have seen research on instances of this idea such as in aspect mining or refactoring for generics. How can we make this idea more repeatable? How can we define the higher-level language concepts in such a way in the first place that their relationships to design patterns or lower-level program structures follow from the language definition, and do not require heartbreaking proofs, after the fact? Overall, shouldn’t we limit the entropy in IT? Otherwise, suddenly 50% percent of all human beings may end up working as programmers. Mark my words!

 

Stage III: “Complex interfaces”

 

Even if you are a wizard of the .NET or Java platform, and have a Master’s in say financial services or “foundations” of eCommerce, you will still not be able to do a “MyAmazon” or “MyEBay” because this sort of application must interface with a good deal of other systems in quite complicated ways. First, the sheer size and the sheer number of interfaces that must be contented by the advanced web application are challenging. (For instance, have a look at FpML – a sized (markup) language for financial products.) Second, the interfaces (or services) come with fundamentally different protocols and kinds of contracts (e.g., synchronous, asynchronous, reliable, occasionally connected, self-persisting, encrypted, session-based, etc., sigh!). Third, evolution is the straw to break the camel’s back. Software evolution, in the presence of complex platforms and complex interfaces is currently infeasible in theory, even though it occasionally works in practice – often by not admitting evolution. No matter whether it works or not, software evolution triggers interesting, additional artificial languages, e.g., transformation languages, and model/metamodels for reverse engineering.

I guess WebServices with WSDL and friends was a first attempt to solve this problem. Does it work? Not for me! Whenever I see a .wsdl file, I am overwhelmed; I look somewhere else. I guess SOA is the current hype way of addressing the problem. In both cases, the foremost emphasis is on service description (based on some sort of signatures and protocols, again languages). In both cases, some efforts address the discovery of services. In neither case, I am aware of a comprehensive approach to service evolution with all the encompassing problems of co-evolution and versioning. I really hope that the WebService/SOA community feels somewhat represented by and interested in SLE. I am sure that SLE will attract experts in programming languages, program transformation, language design, and programming environments; this sort of expertise should go well with the challenges in WebService/SOA. Let’s benefit from each other’s interests and strengths! Let’s meet at SLE 2008 in Toulouse.

 

Stage IV: “Driving models

 

Model-driven engineering/development is partially meant as a solution to some of the problems mentioned in the descriptions of earlier stages. We should ignore the fact that MDE/MDD was initially touted as “vacuous fluff” on the OhMyGosh site, aka MDA. In fact, the idea of generating implementational artifacts from higher-level specifications or “models” is laudable and pretty established in CS, so MDA didn’t quite start the salvation of mankind; here are some distant references, [1], [2], [3], [4], [5], but I am not always sure what “model-driven” means – so these related work pointers are given without any underlying principle of what’s in and out. Nevertheless, I mention them here for the benefit of some of the disoriented PhD students that I occasionally encounter at conferences.

Perhaps my stupid problem with model-driven foo/bar is the following: Yes, sure, if … [fun_level++] … I start from a language setup with only brain-dead abstraction mechanisms, don’t care about the fundamental notions of refinement, translation, composition, partial evaluation, drive my efforts by the ideas of boxes, arrows, and stereotypes, add an amazingly complex constraint mechanism (like I would add the wings of an air shuttle to say a Trabant), invent a dozen notations for state machines of different kinds, throw hundreds of people at standardizing this Godzilla, … [keep on rambling for some time] …, then, yes, I can fly deliver very expressionistic pieces of art (read as UML diagrams) meant as solutions for problems of even the smallest size, differing only in the degree of required masochistic orientation. [fun_level--] (I apologize for this imperative idiom, but it’s just too convenient.)

Anyway, if we just think of it as program refinement, translation, composition, partial evaluation – then it all makes sense. So we just call it model-driven engineering, and try not to get stuck in bad memories of early MDA. No matter what, model-driven engineering is a software-language engineering challenge per excellence. We get additional languages (e.g., the language that is refined to be able to express distribution behavior after refactoring or refinement; think of POPL-powered explanation of the GWT); we get additional operations on languages (other than the obvious modes of compilation and interpretation). This creates (interesting) consistency challenges of a magnitude. Languages that arise due to well-defined refinement or translation are hard to argue away; so here I take the position that we better cope with them, make it easy to endorse them, and fully support their life cycle.

 

Stage V: “Languages as underwear”


Toddlers develop their own natural language, and parents are motivated to catch up and understand them, sometimes imitate them. When friends come for dinner, parents even volunteer as interpreters, and everyone is happy and enjoys the individual language. It also makes sense because a full-blown dictionary and grammar is way too complex and inappropriate for the toddler’s ability and mind and focus.  Also, parents do parenting quite a bit for the entertainment value of toddlers, and those individual languages are part of the entertainment value, no? Also, the time frame for using these languages isn’t completely insane; the toddler is going to change diapers way more often than inventing a new baby language.

In the emerging age of domain-specific languages, or should I say, in the current, more intensive wave of the long-standing DSL movement, we are suddenly able to change our artificial languages like underwear; introduce a new one, every day; use one per component of the application! To me, it is not strikingly obvious that this works, unless we impose appropriate constraints on this ability. (Come to SLE 2008, and tell us, how it works!) Taken to an extreme, “languages as underwear” would correspond to an adult who invents a new (natural) language every day or so, while assuming his colleagues, friends, and partner to catch up and learn this language pretty quickly. Perhaps, the adult is actually teaming up with a few buddies to justify the creation of the new language. Still, imagine everyone being so creative continuously. Let me emphasize that the creation of a special-purpose dictionary, or, let’s say, an ontology (to impose more structure on a dictionary) is Ok. I wonder … what’s fundamentally wrong with say English or German given these languages’ expressiveness? Getting back to artificial languages, what’s wrong with say Haskell 98+, provided we are getting depending typing, clever macros, and a few more gadgets. Then we will be “DSL complete” (as opposed to Turing complete).

Here is a pretty non-controversial definition of the term “domain-specific language”: it’s a language with domain-specific support for notation, abstraction mechanisms, checks, and behaviors (e.g., efficient behaviors based on domain-specific transformations). This sounds good. Sure, we do want the capability to define domain-specific languages. However, the dominating discussion of DSLs lacks some related, very important aspects. Suppose we are close to the point that it is easy to define a DSL, don’t we also need the ability to evolve them, integrate them, retire them, objectively measure the impact of introducing them, or analyze their usage and provide feedback for their evolution? Here I note that not even the mere definition of DSLs is a done deal, unless you are fan of attribute grammars, dependent typing, and a few more weapons of math-based distraction. It is relatively established to provide domain-specific notations, e.g., visual frontends on top of an object model; it is still relatively difficult to provide domain-specific checks, behaviors, and to support DSL evolution, again, unless you are a cutting-edge language nerd – to be clear and serious, I don’t consider myself nearly to be such an expert.

 

Conclusion

 

When the confusion of natural languages happened, it was at least clear that everyone was still talking in some language, even though people would no longer understand each other – too bad. Again, there wasn’t much dispute as to what a language is. In CS, we have taken the confusion of languages to a new level. We are so overwhelmed by language-like entities so that we do not simply lack the ability to understand the various languages at hand; worse than that – we do not even realize when “someone is talking”; we do not even “see” the language when it’s there. Also, we are caught up by details of rendering – comparable to someone who has no ability to deal with even the lowest degree of a dialect. It may also be that we are too openly endorsing slang. By this I mean that the SLE-related areas of CS and IT could perhaps make better use of (and constructively challenge) the fundamental insights accumulated in parsing, formal semantics, programming languages, program transformation, testing, verification, literate programming, and general automated software engineering.

We will inevitably see more languages (programming languages, APIs, DSLs, metalevel/transformation languages, ontologies, …). Understanding the software life cycle of all these languages is a central challenge for the CS discipline. We must improve our abilities to retire languages when they are no longer needed, objectively determine when a domain-specific language is worth designing, provide higher abstraction levels in programming, validate and verify language definitions or implementations, modularize them, think beyond the single-language frame, and endorse language integration and composition, … [ please check out the call for papers for SLE 2008 ] ….

 

Ralf Lämmel

Wannabe Software Language Engineer

More Posts Next page »


 
Page view tracker