JScript, DNA, and Mad Cow Disease

JScript, DNA, and Mad Cow Disease

  • Comments 27

The WebGraphics blog recently had a Javascript quine contest that just ended. I'm totally bummed that I didn't hear about this until it was over. 

What's a "quine"?  A quine is a program which, when run, produces the source code of the program as its output.  The term is an homage to the late philosopher Willard Van Orman Quine, and was coined by Douglas Hofstadter.

There's really not much of a practical upshot to writing quines, but depending on the language you pick they can be quite challenging.  Here's an introductory paper on some of the challenges in writing quines in C.

One twist that actually makes this contest a little bit easier is that since JScript doesn't have a built-in "print out a string" function, the rules of the game state that to be a successful quine, the last expression evaluated by the program must be a string containing the program text.  This also seems "purer" to me; the JScript built-in functions are a documented, guaranteed-to-be-there part of the language, whereas printf and other C runtime library functions used to do the heavy lifting are not part of the language itself, merely commonly used standard runtime libraries.

The submitted solutions are quite inventive, using everything from unescape to regular expressions, to reversed strings. Something that I find interesting about the submitted solutions to this problem (see the link above) is that though they use a number of clever and inventive techniques, they all basically come down to the same technique that a cell uses to reproduce.

From a simplified, abstract reproductive standpoint a cell consists of two parts: the DNA blueprint which encodes instructions for building things out of proteins, and a factory which manufactures whatever the blueprint says. The cell makes a copy of the blueprint, builds whatever the blueprint says to build, and embeds the copy of the blueprint in the built structure. And since the blueprint in a cell describes how to build an identical cell, this process can continue on in the daughter cell. (If the blueprint does not describe how to build an identical cell, that might be because the cell has been infected by a virus which has hijacked the factory to make copies of the virus instead of the cell.  Or, perhaps the DNA has mutated, or the factory is damaged in some way.)

Similarly, these programs all have a "DNA" blueprint string that describes the structure of something, they have a universal constructor which constructs a copy of the blueprint plus whatever the blueprint says to construct, and they have a blueprint which describes the universal constructor itself. Some of the constructors are more universal than others -- some really could construct any old string, some have been specifically tuned to solve the quine problem, but the basic idea is the same throughout.

Here's the most straightforward and most universal example of what I mean:

a='b="a=%27"+a+"%27;"+unescape(a)';b="a='"+a+"';"+unescape(a)

Notice how the program clearly separates the static blueprint statement from the dynamic factory statement.  The blueprint just contains a pattern which the factory can use to build a string.  The factory produces a copy of the blueprint and then executes the blueprint to produce the result, and appends it to the blueprint.  Just like a cell produces a copy of the DNA and then builds a new cell to wrap around it.  This program could produce almost any output given a suitable blueprint; it just happens to have a blueprint for itself.

To continue this analogy, I like to think of cheater quines as prions.  Prions are the protein structures that cause Mad Cow Disease and related diseases. Unlike viruses and bacteria, prions contain no DNA blueprint whatsoever.  Rather, they replicate themselves in a really sneaky way.  A prion is simply a protein that is a variation on the shape of a "normal" protein often found in animal's brains.  But it is a very special shape -- it is one that has the property that when it touches a protein of the "normal" shape, it twists the normal protein into the prion shape, thereby replicating itself.  It's like if you had a special spoon of such a shape that whenever it touched a fork, the fork turned into a spoon. Throw that thing into a drawer full of forks, and pretty soon you'll have no forks left.

The thing about prions is that they are completely "implementation detail dependent".  A cell is a universal constructor -- it can build almost anything out of proteins, from virii to rhinoceri.  You can change the structure of the blueprint and the factory will cheerfully build whatever the new blueprint specifies.  Prions can't build anything and do not contain instructions to build anything.  They're just shapes that happen to have the weird property that they can turn things that already sort of look like themselves into things that look exactly like themselves.  "Prion" quines are like that too -- they work by taking advantage of implementation details of the surrounding environment to reproduce.

UPDATE:

Here are two challenges. 

Challenge #1:

What's the shortest JScript "pure" quine you can come up with?  That is, a quine that does not use any library functions.  (unescape, eval, regexp functions, charCodeAt, etc.)

Here's my entry -- well, not exactly.  I've added carriage returns, spaces and comments to make it readable.  Without the CRs and comments it's a 240 character quine.

// avoid having to escape quotes:
var q='"';
// DNA blueprint section
var b=[
"var s=",
q,
"var q='",
q,
"+q+",
q,
"';var b=[",
q,
";for(i in b)s+=b[i]==q?'q,':'",
q,
"'+b[i]+'",
q,
",';s+='];';for(i in b)s+=b[i];",
];
// reproduce the header
var s="var q='"+q+"';var b=[";
// reproduce the blueprint
for(i in b)
  s+=b[i]==q ?
    'q,' :
    '"'+b[i]+'",';
s+='];';
// run the factory on the blueprint
for(i in b)
 
s+=b[i];

Challenge #2:

My program above, since it has those comments and CRs, is not a quite a quine. But it's last expression evaluated is a 240 character quine.  Let's call a program that is almost a quine in this manner a "quine uncle".  Can you find a JScript quine uncle which is SHORTER than the resulting quine?  How short can you make it?

  • I almost set fire to my brain trying to write one of these once. In fact, Douglas Hofstadter has a lot to answer for in terms of nearly casuing me irreversible brain-damage; when I read GEB I tried to write a couple of quining programs, attempting to implement his bloop/floop/gloop languages, spent weeks constructing bizarre reversible dialogs, and generally had the most pyschedelic experience this side of a blotter...

    Metaphysics, cognitive psychology, humour, absurdity... I'd say that GEB is among the greatest books ever written.
  • Oh, and did you intend everything after the line of code to look like it does?
  • Looks fine on my box. What does it look like to you?
  • Evil things to do with quines:
    http://www.acm.org/classics/sep95/
  • Heh. I don't have to look far to find a pure quine that is smaller than that. If you look at the entries in the JavaScript Quine Contest, you'll find one at 193 and the same one cut a few characters down to 181 characters in the submissions by Patrick H. Lauke. However, those have internal redundancies. As you might note, all actions within the function wrapper could as well take place without a function wrapper, generating the following 107 characters "pure" quine, which uses only a single expression statement, three assignment expressions and fifteen string concatenations.

    q='\'',s='\\',a='"q="+q+s+q+q+",s="+q+s+s+q+",a="+q+a+q+","+a',"q="+q+s+q+q+",s="+q+s+s+q+",a="+q+a+q+","+a

    Of course, if you want to create a "Quine Uncle" from it, you can easily split it on separate lines and comment it up.

    // Assign single quote to q
    q='\'',
    // Assign reverse solidus to s
    s='\\',
    // Template using q and s
    a='"q="+q+s+q+q+",s="+q+s+s+q+",a="+q+a+q+","+a',
    // And finally, the factory
    "q="+q+s+q+q+",s="+q+s+s+q+",a="+q+a+q+","+a

    I actually have a hard time finding anything I could cut away in that one, considering the limitation to only use operators for processing. (Which is just the restriction you have put on it by demanding that we use no built in function.)

    Sure, maybe a loop can cut it. In SpiderMonkey we could maybe use a direct call on a regular expression* to get around that limitation, but I assume you want us to use only core ECMAScript. I would, were I you.


    * in SpiderMonkey, the [[call]] member of a regex is an alias for the exec member. Thus, you can do the following:

    /miss/i('Mississippi') //=> 'Miss'
  • Is there any reason why

    a= function () { return "a= " + a + ";a()"; } ;a()

    isn't a quine? I'd assume so, 'cos it seems like the most obvious thing to do.
  • Awesome. Can we get smaller than 107?

    And what about my tiny quine uncle challenge? Obviously it's easy to turn any quine into a longer quine uncle by just adding spaces and comments, but can you turn a quine into a quine uncle by taking stuff away?

    The obvious way would be to build a quine that deliberately has extra whitespace, and then take the whitespace away in the quine uncle. That's lame. Are there more clever ways to do it?
  • > a= function () { return "a= " + a + ";a()"; } ;a()

    I'd accept that, but liorean explicitly disallowed the decompilation operator in his contest.
  • Ack! The fiend. The moral of the story is always read the rules.
  • I think I have a really good reason for that limitation,. but it might demand some context. I would like to revise the Prion simile that Eric used for cheating quines. To remain in the world of cellular biology I would instead like to do a direct simile between cheating quines and virion. Virons consists of a core nucleotide acid string (the template) and shielding container (the delimiters) and may also contain some partial factory features. However, these virions need to get inside of a cell (the runtime environment of a scripting engine) and with the help of the cellular proteins (full or partial factories) they multiply the nucleotide acid string (template) and use it to create the delimiters (container) to finally form copies of the original virion. This is what a cheating quine is. And in the same way that reading the source from the file system or the memory is to use the metaphorical cellular machinery, so is using the built in deparsers.

    Anyway I have another, this time practical, reason for not using the internal deparsers. This one could be summed up as: implementation specific. For the first, the string delimiters used for string representations are not standardised. Second, neither are whitespace, newlines and statement terminators. Third, I know that Netscape's deparser has changed among others the array literal notation. Also, I know some Opera versions didn't produce any string delimiters when deparsing.
  • This is a quine uncle at 128 characters that generates the 107 characters pure quine above. Every character in it is as far as I can see necessary.

    S='q2100,s2110,a280,8';for(i=9;i--;)S=S.replace(eval('/'+i+'/g'),'\'9\\9=09+q9+s9+a9="39+",9"q64337s64437a6537"5'.split(9)[i]);S
  • > printf and other C runtime library functions
    > used to do the heavy lifting are not part of
    > the language itself, merely commonly used
    > standard runtime libraries.

    Since the time of the first standard (1989 in the US and 1990 worldwide), there have essentially been two kinds of C. printf() and a ton of other functions are guaranteed to be provided by hosted implementations, though they are optional in freestanding implementations. Most development environments offered by your employer, and most applications built with them, involve hosted implementations. The same phenomenon occurs in Linux, VMS, mainframes, etc. Hosted implementations are still C.

    Meanwhile, one year the IOCCC operators asserted that one winner was the smallest possible self-reproducing program. In my opinion it should not have been a winner for two reasons: (1) it was not obfuscated in the least, and (2) it violated the ISO standard in at least two ways. However, if fed to a selected implementation, it could be self-reproducing.
  • And this one is at 89 characters, removing much of the replacement machinerin in the one above, only making the replacement with the biggest reduction. (Also note how neatly it removed all the double quotes from within the single quoted string, allowing me to wrap all the single quoted strings in a double quoted string.)

    S="q='\\'',s='\\\\',a='_',_".replace(/_/g,'"q="+q+s+q+q+",s="+q+s+s+q+",a="+q+a+q+","+a')
  • (And you can drop "S=" from it for a win of two characters...)
  • Indeed, i smell certain "practical upshot" in a self-reproductive design.. Recently was making a simple [Feedback] script, contained in a html field of every message of our [Outlook-looking] proj. tracking app.. in the end it was enough to say:

    document.write(document.documentElement.outerHTML);

    (oh, if only it was a JS!..)
Page 1 of 2 (27 items) 12