.TH DATADEC L
.SH NAME
datadec \- ANSI C data declaration module constructor
.SH SYNOPSIS
datadec
.RB [\- vno ]
.I basename
.RI [ infile ]
.SH DESCRIPTION
.B Datadec
takes an input file - or stdin if no input file is given -
containing a series of HOPE/Miranda style recursive data declarations
with optional hints on printing, 
and builds a definition/implementation pair of ANSI C files \-
.I "basename.h"
and
.I "basename.c"
containing data declarations,
constructor functions, deconstructor functions and printing functions.

.PP
The two files produced together form a module that implements the relevant
data types.

.SH "OPTIONS"
.TP 8
.B "\-v"
enter verbose mode.
.I "Datadec"
now displays the data types that it parses, along with various almost
certainly useless bits of information about optimization.
.TP
.B "\-n"
do not perform various optimizations.
.TP
.B "\-o"
perform optimizations (the default).

.SH "AN EXAMPLE"
.PP
The simplest use is to prepare an input file, such as
.IR data.in ,
which might (for example) contain:
.nf
TYPE {
	IntList =  Null or Cons( int first, IntList next );
	ILList  =  Null or Cons( IntList first, ILList next );
	IdTree  =  Leaf( string id )
		or Node( IdTree left, IdTree right );
}
.fi
To generate C code implementing these types, invoke:
.nf
     datadec eek data.in
.fi
which generates
.I "eek.h"
and
.I "eek.c"
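.PP
To give a feel for what such a module provides, here is a hand-written
sketch of a C representation of the
.I "IntList"
type, with one constructor per shape and a deconstructor for Cons.
Every identifier in it (intlist_null, intlist_cons, intlist_is_cons,
intlist_get_cons) is hypothetical and chosen for illustration only;
the names and layout that datadec really generates may well differ.
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: this is NOT datadec's actual output. */
typedef struct intlist_cell *IntList;

struct intlist_cell {
	int     first;	/* Cons parameter 1 */
	IntList next;	/* Cons parameter 2 */
};

/* Null shape: it has no parameters, so a null pointer is enough. */
IntList intlist_null(void)
{
	return NULL;
}

/* Cons shape: allocate a cell holding both parameters. */
IntList intlist_cons(int first, IntList next)
{
	IntList l = malloc(sizeof *l);
	if (l == NULL)
		abort();
	l->first = first;
	l->next = next;
	return l;
}

/* Shape test: non-zero when l was built with Cons. */
int intlist_is_cons(IntList l)
{
	return l != NULL;
}

/* Deconstructor: break a Cons back into its two parameters. */
void intlist_get_cons(IntList l, int *first, IntList *next)
{
	assert(intlist_is_cons(l));
	*first = l->first;
	*next = l->next;
}
```
.fi
.PP
Under these assumptions, a caller would build a list with
intlist_cons( 1, intlist_cons( 2, intlist_null() ) )
and walk it by testing the shape and then deconstructing it.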

.SH THE DATA DECLARATION LANGUAGE

The language accepted by datadec is split into two components:
the "outer language" is patterned after
the GMD compiler tools
.B "LALR"
and
.B "REX"
(similar to Yacc and Lex)
and allows you to specify four sections (only the last is compulsory):

.PP
.nf
.B "[ EXPORT { free_format_text } ]"
.br
.B "[ GLOBAL { free_format_text } ]"
.br
.B "[ BEGIN { free_format_text } ]"
.br
.B "TYPE { types }"
.fi

.PP
The contents of the
.I "export"
section are placed in the header file (the .h).
Commonly, you will want to add extern function declarations, public types and
external variable declarations, which must be
placed at the top of the header file, and also to declare additional
procedures that use the automatically generated types, which must be placed
after the type declarations.
To achieve this, you should place a `@@' in the export section -- the text up
to that point is placed at the top of the header file, whereas the text
after it is placed at the bottom of the header file -- after all the types
have been defined.

.PP
Similarly, the contents of the 
.I "global"
section are placed in the C file,
again with `@@' being used to split the global section into "top of file" and
"bottom of file" pieces.

.PP
Similarly, the contents of the 
.I "begin"
section are placed in an initialization procedure, which the user of the
constructed module must remember to call at an appropriate juncture (e.g.
immediately when main starts).

.PP
The
.I "types"
section contains the type declarations themselves - the inner language.

.PP
The "inner language" - that of specifying the actual types section -
is closely modelled on Miranda or Hope, with printing rules added.
Here is the grammar:

.PP
.nf
types	= list(type)
.br
type 	= type_name '=' shape list( 'or' shape ) ';'
.br
shape	= constructor_name [ '(' params ')' ] [ print ]
.br
params	= param list( ',' param)
.br
param	= type_name param_name
.br
print	= list(element)
.br
element	= number | string_literal
.fi

.PP
Note that each data type is terminated by a semicolon,
and that (within one data type) each shape is separated from the next by 'or'
(just like the '|' in Miranda).
If a particular shape has parameters, they are separated from each other
by commas.
Each type name is simply an identifier.

.PP
.I "Datadec"
also generates routines to write each type to an open FILE *.
The method of printing each shape is governed by the presence or absence
of a print rule.  If no print rule is given, the constructor name is printed,
and then each parameter is written out using the appropriate print routine.

.PP
If a print rule is given, each print element
(these are syntactically separated by whitespace)
is used to generate the write routine as follows:
A literal string will simply be printed
(well, '\\n' is turned into a newline!),
whereas a number (e.g. 4) means that the
4th parameter is printed (invoking the print function for that parameter's type).

.PP
For example, we could augment the
.I "IdTree"
type from the example given above with print rules:

.nf
TYPE {
IdTree  = Leaf( string id )			"leaf(" 1 ")"
	or Node( IdTree left, IdTree right )	"node(" 1 ",\\n" 2 ")";
}
.fi

.PP
Now, an IdTree constructed as
.nf
Node( Leaf( "hello" ), Leaf( "there" ) )
.fi
would print as:
.nf
node(leaf("hello"),
.br
leaf("there"))
.fi
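.PP
The way print rules drive a write routine can be sketched by hand in C.
The sketch below is not datadec's generated code: the struct layout and
the names mk_leaf, mk_node, print_string and print_idtree are all
hypothetical. It also assumes that strings are written inside double
quotes, and it uses a one-line Node rule, "node(" 1 ", " 2 ")", in place
of the newline variant shown above.
.nf
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical representation; datadec's real output may differ. */
typedef struct idtree_cell *IdTree;

struct idtree_cell {
	int         is_node;	/* 0 = Leaf shape, 1 = Node shape */
	const char *id;		/* Leaf parameter 1 */
	IdTree      left;	/* Node parameter 1 */
	IdTree      right;	/* Node parameter 2 */
};

IdTree mk_leaf(const char *id)
{
	IdTree t = malloc(sizeof *t);
	if (t == NULL)
		abort();
	t->is_node = 0;
	t->id = id;
	t->left = NULL;
	t->right = NULL;
	return t;
}

IdTree mk_node(IdTree left, IdTree right)
{
	IdTree t = malloc(sizeof *t);
	if (t == NULL)
		abort();
	t->is_node = 1;
	t->id = NULL;
	t->left = left;
	t->right = right;
	return t;
}

/* Assumption: a string parameter is written in double quotes. */
static void print_string(FILE *out, const char *s)
{
	fputc('"', out);
	fputs(s, out);
	fputc('"', out);
}

/* Each literal element becomes an fputs; each number becomes a call
 * to the print routine for that parameter's type. */
void print_idtree(FILE *out, IdTree t)
{
	assert(t != NULL);
	if (t->is_node) {
		/* rule: "node(" 1 ", " 2 ")" */
		fputs("node(", out);
		print_idtree(out, t->left);
		fputs(", ", out);
		print_idtree(out, t->right);
		fputs(")", out);
	} else {
		/* rule: "leaf(" 1 ")" */
		fputs("leaf(", out);
		print_string(out, t->id);
		fputs(")", out);
	}
}
```
.fi
.PP
Under these assumptions, calling print_idtree on
mk_node( mk_leaf( "hello" ), mk_leaf( "there" ) )
writes node(leaf("hello"), leaf("there")) to the given FILE *.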

.SH SEE ALSO
.nf
LALR, REX, Miranda Language Definition.
.fi

.SH BUGS
Some single-letter type names (e.g. "f" or "p") could clash with internal
parameter names in the print routines, leading to syntax errors when you
compile the files generated by datadec.
.PP
Someday I'll get it to free up the types too!
.PP
And, finally, one day I'll have to write the C++ and Java versions :-)

.SH "AUTHOR"
Duncan C. White, D.White@surrey.ac.uk.