--> Self Reproducing Programs

Self Reproducing Programs

Thompson proposed the following programming exercise:

He gave an example of a C program, which did reproduce all of it's language tokens, but which did not maintain the whitespace.

As a "purist", I require a few additional conditions:

  1. The program's output must be an exact copy of the program's source file, including any whitespace and line separator characters (to be tested by a byte-to-byte, binary file compare).
  2. All lines in the program source file must end at or before column 80.
  3. All lines (also the last) must end with a newline character.
  4. The program shall not depend on the character set encoding, especially that of the newline character.
  5. The program must compile without any warnings.

Without these conditions, the programs could be considerably shorter and simpler in many langugages, but it would also be more difficult to read them on a screen or to print them.

In most languages, the 2nd condition forces the program into more than one line. Some languages will need more than one line anyway. By the 2nd condition, these languages are no longer handicapped so much.

The goal of the contest is to minimize the number of lines and language tokens, while still maintaining some readability. Two versions which differ only by the number of spaces or by the length of variable names are considered equal.

My little collection currently contains versions for

Progr. Language# of linesAuthor
C 6 Georg Fischer
Java 11 Georg Fischer
Pascal 9 Georg Fischer
Perl 2 ?
REXX (1) Franz-Josef Gasper

There also is a makefile for easy compilation and binary comparision of the source files and their output.

I encourage all programmers to email me different, preferrably shorter versions, hopefully also in other programming languages.

Further reading

C

This version is close to Thompsons's and contains a string with the program text, which is "folded" in a specific way. The program it interpretes every character in the string and prints it, possible by an escape sequence. The escape character (backslash) must be duplicated, sometimes even twice. Conditional expressions (cond?then:else) instead of a switch reduce the size by 2 lines.

char s[]="\";int main(){int i;char t[2]=\" \";printf(\"char s[]=\\\"\");\n\
for(i=0;s[i];i++){*t=s[i];printf(*t=='\\n'?\"\\\\n\\\\\\n\":(*t=='\\\\'?\n\
\"\\\\\\\\\":(*t=='\\\"'?\"\\\\\\\"\":t)));}printf(s);return 0;}\n\
";int main(){int i;char t[2]=" ";printf("char s[]=\"");
for(i=0;s[i];i++){*t=s[i];printf(*t=='\n'?"\\n\\\n":(*t=='\\'?
"\\\\":(*t=='\"'?"\\\"":t)));}printf(s);return 0;}

(Turbo-) Pascal

This program is a translation of the C version. Conditional expressions cannot be used. Some escaping is avoided by the constant "q", and instead of a newline character an array of strings is used for the individual lines.

program r;var i,j:integer;const a:array[1..5]of string=(
''''');const q:string='''''''';procedure p(s:string);begin write(q);for i:=1',
'to length(s)do if s[i]=q then write(q+q)else write(s[i]);writeln(q+'','')end',
';begin writeln(''program r;var i,j:integer;const a:array[1..5]of string=(''',
');for j:=1 to 4 do p(a[j]);for j:=1 to 4 do writeln(a[j]);end.',
'');const q:string='''';procedure p(s:string);begin write(q);for i:=1
to length(s)do if s[i]=q then write(q+q)else write(s[i]);writeln(q+',')end
;begin writeln('program r;var i,j:integer;const a:array[1..5]of string=('
);for j:=1 to 4 do p(a[j]);for j:=1 to 4 do writeln(a[j]);end.

Java

The Java program is a translation of the Pascal version. It also uses an array of strings for the lines of the essential program text.

class r{static int i,j;static String s[]={
"\"\"};public static void main(String args[]){System.out.println(\"class r{\"",
"+\"static int i,j;static String s[]={\");for(j=0;j<5;j++){System.out.print(",
"'\\\"');for(i=0;s[j].length()>i;i++){char c=s[j].charAt(i);System.out.print(",
"c=='\\\\'?\"\\\\\\\\\":(c=='\\\"'?\"\\\\\\\"\":s[j].substring(i,i+1)));}",
"System.out.println(\"\\\",\");}for(j=0;j<5;j++)System.out.println(s[j]);}}",
""};public static void main(String args[]){System.out.println("class r{"
+"static int i,j;static String s[]={");for(j=0;j<5;j++){System.out.print(
'\"');for(i=0;s[j].length()>i;i++){char c=s[j].charAt(i);System.out.print(
c=='\\'?"\\\\":(c=='\"'?"\\\"":s[j].substring(i,i+1)));}
System.out.println("\",");}for(j=0;j<5;j++)System.out.println(s[j]);}}

Perl

Of course, Perl allows for a very short version:

$r='\'; $_=$r; s/([\\\'\\\\])/\\\\$1/g; print \'$r=\\\'\'.$_.$r;
'; $_=$r; s/([\'\\])/\\$1/g; print '$r=\''.$_.$r

REXX

REXX is a scripting (interpreted) language developped at IBM. It was influenced by CLIST and PL/1.

Franz-Josef Gasper's one-liner (118 bytes) does not meet the 80 column condition, but it uses a rather unique approach:

say delstr(delstr(delstr(copies('say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)',4),34,24),67,56),92,32)

For better understanding (and for those without a REXX interpreter) a trace follows:

after copies(...,4):
                                     1         2         3             1         2
                            12345678901234567890123456789012_34+3456789012345678901234
  say delstr(delstr(delstr(                                   xxxxxxxxxxxxxxxxxxxxxxxx  ,34,24),67,56),92,32)
                           'say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)'
  
after delstr(...,34,24):
                              1         2         3          4         5         6                1         2         3         4         5     
                     12345678901234567890123456789012_345678901234567890123456789012345_67+345678901234567890123456789012345678901234567890123456
  say delstr(delstr(                                                                     xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  ,67,56),92,32)
                    'say delstr(delstr(delstr(copies(''say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)'
  
after delstr(...,67,56):
                       1         2         3          4         5         6          7         8         9           1         2         3  
              12345678901234567890123456789012_345678901234567890123456789012345_6_78901234567890123456789012+345678901234567890123456789012
  say delstr(                                                                                               xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  ,92,32)
             'say delstr(delstr(delstr(copies(''say delstr(delstr(delstr(copies('''',4),34,24),67,56),92,32)say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)'
  
after delstr(...,92,32):
  say 'say delstr(delstr(delstr(copies(''say delstr(delstr(delstr(copies('''',4),34,24),67,56),92,32)'',4),34,24),67,56),92,32)'

which prints:
  say delstr(delstr(delstr(copies('say delstr(delstr(delstr(copies('',4),34,24),67,56),92,32)',4),34,24),67,56),92,32)

makefile

With an appropriate makefile you may compile, link and test the programs and binary compare the source files with their output.

Last modification: 2012-07-22
Please email any questions and additions to:
<punctum@punctum.com> Dr. Georg Fischer