Commit 0a888838 authored by Duncan White

Some major improvements: changed to getopt() for flag processing, and added -s FUNC suppression set functionality so that you can suppress the automatically generated print_TYPE function (because you've written your own tweaked version).
parent 93b36a9a
......@@ -13,8 +13,8 @@ CFLAGS = -g -UDEBUGGING -Wall
LDLIBS =
EXECS = datadec
datadec_srcs = datadec.c parser.c lexer.c struct.c decs.c optimize.c
datadec_objs = datadec.o parser.o lexer.o struct.o decs.o optimize.o
datadec_srcs = datadec.c parser.c lexer.c struct.c decs.c optimize.c set.c
datadec_objs = datadec.o parser.o lexer.o struct.o decs.o optimize.o set.o
all: $(EXECS)
......@@ -30,9 +30,10 @@ install: $(EXECS)
datadec: $(datadec_objs)
datadec.o: struct.h lexer.h parser.h decs.h optimize.h
datadec.o: struct.h lexer.h parser.h decs.h optimize.h set.h
decs.o: struct.h decs.h
lexer.o: struct.h lexer.h
optimize.o: optimize.h struct.h
parser.o: struct.h lexer.h parser.h
struct.o: struct.h
set.o: set.h
......@@ -7,14 +7,33 @@ code to implement them.
Duncan C. White, d.white@imperial.ac.uk
19th March 2002
New May 2014! experimental free functions for every inductive data type
(run datadec with new -f option)
New June 2018! converted to use stdbool.h at last, and added new code to
quietly write out a .basename.dd file listing all the types and shapes,
and for each shape, the parameter types. This will be useful for add-on
tools such as the experimental "CPM.perl" script, which tackles client
side use of datadec-generated types, translating C+Pattern Matches to C.
New May 2014! v1.2
- experimental free functions for every inductive data type
(run datadec with new -f option)
New June 2018! v1.3
- converted to use stdbool.h at last, inside and out
- added new mode (-m) and new code to emit on stdout a list of all the
metadata, i.e. for each shape of each type, typename, shapename, and
the shape parameter types. This will be useful for add-on tools such
as the experimental "CPM.perl" script, which tackles client side use
of datadec-generated types, translating C+Pattern Matches to C.
- added assert( p != NULL ) in every constructor after the NEW( T ) call.
- fixed longstanding bug whereby the top of the global section was
copied AFTER the <<#include "thismodule.h">>, whereas if it were copied
above it, it could #include other stuff. But of course now it can't use
the defined types from "thismodule.h".
- changed the arg processing to use getopt.
- added a new "suppress function F" multi-flag, and stored the results in
a "list (could even be a set) of strings of named functions to suppress,
i.e. perhaps because an alternative manually tweaked version is included in
the GLOBAL section." Did this by importing strlist module from libADTs.
An Example of Datadec in Action
-------------------------------
......
......@@ -4,13 +4,20 @@ datadec \- ANSI C data declaration module constructor
.SH SYNOPSIS
datadec
.RB [\- vfno ]
.RB [\- s FUNCTIONNAME...]
.I basename
.RB [infile]
.RB infile
.PP
OR
.PP
datadec
.RB [\- m ]
.RB infile
.SH DESCRIPTION
.B Datadec
takes an input file - or stdin if no input file is given -
containing a series of Haskell style recursive (or inductive) datatype
declarations - with optional hints on printing and freeing,
takes an input file containing a series of Haskell style recursive (or
inductive) datatype declarations - with optional hints on printing and freeing,
and builds a definition/implementation pair of ANSI C files \-
.I "basename.h"
and
......@@ -39,6 +46,25 @@ do not perform various optimizations.
.TP
.B "\-o"
perform optimizations (the default).
.TP
.B "\-m"
Do **NOT** produce the normal module. Instead, produce
.I "meta-data"
on stdout. This lists, for each type and shape, the typename,
the shapename, and a comma-separated list of the shape parameter
types.
.TP
.B "\-s FUNCTIONNAME"
Add
.I FUNCTIONNAME
to the set of functions to be suppressed, i.e. not generated in the
output .c file.
Currently this is only implemented for print_TYPE functions,
but it would be trivial to implement for constructors, deconstructors
and free functions too.
The usual reason for using this suppression feature is because you
have provided a manually constructed optimized print function in the
GLOBAL section of the datadec input file.
.SH "AN EXAMPLE"
.PP
......@@ -62,6 +88,13 @@ which generates
and
.I "eek.c"
.fi
If you wanted to suppress print_idtree and print_illist from eek.c
(while leaving their prototypes in eek.h), you would run:
.nf
datadec -s print_idtree -s print_illist eek data.in
.fi
.SH THE DATA DECLARATION LANGUAGE
The language accepted by datadec is split into two components:
......@@ -105,6 +138,12 @@ Similarly, the contents of the
section are placed in the C file,
again with `@@' being used to split the global section into "top of file" and
"bottom of file" pieces.
Note that the "top of file" piece is placed ABOVE the '#include "thismodule.h"'
line and thus CANNOT use the inductive data types themselves. However, this
makes it the perfect place to add #include's to allow the data types to use
other types as fields in the shapes. For example, should one of your inductive
data types use a "set h" parameter in a shape, #include "set.h" might need
to go into GLOBAL above the @@.
.PP
Similarly, the contents of the
......@@ -120,7 +159,7 @@ section contains the type declarations themselves - the inner language.
.PP
The "inner language" - that of specifying the actual types section -
is closely modelled on Miranda or Hope, with printing rules added.
is closely modelled on Haskell, Miranda or Hope, with printing rules added.
Here is the grammar:
.PP
......
......@@ -25,9 +25,11 @@
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdbool.h>
#include <string.h>
#include "set.h"
#include "struct.h"
#include "lexer.h"
#include "parser.h"
......@@ -35,69 +37,61 @@
#include "optimize.h"
#define USAGEMSG "Usage: datadec [-vnof] outfile infile\nOr: datadec -m infile\n"
#define USAGEMSG "Usage: datadec [-vnof] [-s FNAME [-s OTHERFNAME...]] outfile infile\nOr: datadec -m infile\n"
#define MUSTBE(b) {if(!(b)){fprintf(stderr, USAGEMSG);exit(1);}}
#define NEED_ANOTHER_ARG MUSTBE( IS_ANOTHER_ARG )
#define REQUIRE_NO_MORE_ARGS MUSTBE( argc == arg )
#define IS_ANOTHER_ARG (argc > arg)
#define CHUNKSIZE 10000
typedef char bigstr[ CHUNKSIZE ];
set suppress_funcs;
int main( int argc, char **argv )
{
char *basename;
char *s;
declnlist declns;
int len;
int arg;
bigstr exports, globals, begin;
int gor;
arg = 1;
NEED_ANOTHER_ARG;
verbose = false; opt = true;
suppress_funcs = setCreate( NULL );
verbose = false;
opt = true;
metaonly = false;
while( *(s=argv[arg]) == '-' )
while( (gor = getopt(argc, argv, "vfnoms:") ) != -1 )
{
for( s++; *s; s++ )
switch( gor )
{
switch( *s )
{
case 'v':
verbose = true;
break;
case 'f':
makefree = true;
break;
case 'n':
opt = false;
break;
case 'o':
opt = true;
break;
case 'm':
metaonly = true;
break;
default:
fprintf( stderr,
"datadec: illegal option -%c\n", *s );
MUSTBE(false);
exit(1);
}
case 'v':
verbose = true;
break;
case 'f':
makefree = true;
break;
case 'n':
opt = false;
break;
case 'o':
opt = true;
break;
case 'm':
metaonly = true;
break;
case 's':
setAdd( suppress_funcs, optarg );
break;
default:
MUSTBE(false);
}
arg++;
NEED_ANOTHER_ARG;
}
MUSTBE( optind < argc );
if( ! metaonly )
{
basename = argv[arg++];
basename = argv[optind++];
len = strlen( basename );
if( !strcmp( basename+len-2, ".c" ) )
{
......@@ -105,21 +99,18 @@ int main( int argc, char **argv )
}
}
NEED_ANOTHER_ARG;
//if( IS_ANOTHER_ARG ) {
lexfile = fopen( argv[arg], "r" );
if( lexfile == NULL )
{
fprintf( stderr, "datadec: can't open '%s'\n",
argv[arg] );
exit(1);
}
arg++;
//} else {
// lexfile = stdin;
//}
MUSTBE( optind < argc );
lexfile = fopen( argv[optind], "r" );
if( lexfile == NULL )
{
fprintf( stderr, "datadec: can't open '%s'\n",
argv[optind] );
exit(1);
}
optind++;
REQUIRE_NO_MORE_ARGS;
MUSTBE( optind == argc );
if( ! parse_data( exports, globals, begin, &declns ) )
{
......@@ -133,15 +124,20 @@ int main( int argc, char **argv )
printf( "exports = {%s}\n", exports );
printf( "globals = {%s}\n", globals );
printf( "begin = {%s}\n", begin );
printf( "suppress = " );
setDump( stdout, suppress_funcs );
printf( "\n" );
}
optimize( declns );
if( ! metaonly )
{
make_declns( exports, globals, begin, declns, basename );
make_declns( exports, globals, begin, declns, suppress_funcs,
basename );
} else
{
make_metadata( declns );
}
setFree( suppress_funcs );
exit(0);
/*NOTREACHED*/
}
......@@ -4,6 +4,7 @@
#include <string.h>
#include "struct.h"
#include "set.h"
#include "decs.h"
bool makefree = false;
......@@ -11,7 +12,7 @@ bool makefree = false;
//static void line( char * fmt, long a, long b, long c, long d );
static void literalline( char * mesg );
static void h_declns( char * base, char * exports, bool init, declnlist d );
static void c_declns( char * base, char * globals, char * begin, declnlist d );
static void c_declns( char * base, char * globals, char * begin, set suppress, declnlist d );
static void ddtypes( declnlist d );
static void ddoneshape( decln d, shapelist s );
static void data_decls( declnlist decs );
......@@ -29,7 +30,7 @@ static void deconskind_fn( decln d );
static void decons_fn_proto( decln d, shapelist s, bool prot );
static void decons_fn( decln d, shapelist s );
static void print_fn_proto( char * name, bool prot );
static void print_fns( declnlist d );
static void print_fns( declnlist d, set suppress );
static void print_fn_shape( declnlist d, shapelist s );
static void print_all_params( declnlist d, shapelist s );
static void print_param( shapelist s, paramlist p, bool Union );
......@@ -75,11 +76,11 @@ static void literalline( char *mesg )
}
void make_declns( char *exports, char *globals, char *begin, declnlist d, char *base )
void make_declns( char *exports, char *globals, char *begin, declnlist d, set suppress, char *base )
{
printf( "datadec: Making data declarations in %s.[ch]\n", base );
h_declns( base, exports, *begin != '\0', d );
c_declns( base, globals, begin, d );
c_declns( base, globals, begin, suppress, d );
}
......@@ -113,7 +114,7 @@ static void h_declns( char *base, char *exports, bool init, declnlist d )
exportptr = exports;
if( *exports != '\0' )
{
line( "/* Contents of EXPORT section */" );
line( "\n/* Contents of top part of EXPORT section */" );
for( ; *exportptr; exportptr++ )
{
if( *exportptr == '@' && exportptr[1] == '@'
......@@ -146,7 +147,7 @@ static void h_declns( char *base, char *exports, bool init, declnlist d )
}
static void c_declns( char *base, char *globals, char *begin, declnlist d )
static void c_declns( char *base, char *globals, char *begin, set suppress, declnlist d )
{
char tempname[256];
FILE *cfile;
......@@ -168,12 +169,12 @@ static void c_declns( char *base, char *globals, char *begin, declnlist d )
line( "#include <stdio.h>" );
line( "#include <stdlib.h>" );
line( "#include <stdbool.h>" );
line( "#include \"%s.h\"\n\n", base );
line( "#include <assert.h>" );
globalptr=globals;
if( *globals != '\0' )
{
line( "/* Contents of GLOBAL section */" );
line( "\n/* Contents of top part of GLOBAL section */" );
for( ; *globalptr; globalptr++ )
{
if( *globalptr == '@' && globalptr[1] == '@'
......@@ -189,9 +190,11 @@ static void c_declns( char *base, char *globals, char *begin, declnlist d )
nl();
}
line( "#include \"%s.h\"\n\n", base );
cons_fns( d );
decons_fns( d );
print_fns( d );
print_fns( d, suppress );
if( makefree )
{
free_fns( d );
......@@ -530,6 +533,7 @@ static void cons_fn( decln d, shapelist s )
line( "\n{" );
indent();
line( "%s\tnew = NEW(%s);", d->name, d->name );
fprintf( outfile, "\tassert( new != NULL );\n" );
if( d->TagField )
{
line( "new->tag = %s_is_%s;", d->name, s->name );
......@@ -770,16 +774,21 @@ static void print_fn_proto( char *name, bool prot )
}
static void print_fns( declnlist d )
static void print_fns( declnlist d, set suppress )
{
for( ; d != NULL; d = d->next )
{
shapelist s;
line( "void print_%s( FILE *f, %s p )",d->name, d->name );
char fname[2048];
sprintf( fname, "print_%s", d->name );
if( setIn( suppress, fname ) )
{
fprintf( stderr, "Suppressing %s\n", fname );
continue;
}
line( "void %s( FILE *f, %s p )", fname, d->name );
line( "{" );
indent();
s = d->shapes;
shapelist s = d->shapes;
if( d->UseNull )
{
line( "if( p == NULL )" );
......
extern bool makefree;
extern void make_declns( char * exports, char * globals, char * begin, declnlist d, char * base );
extern void make_declns( char * exports, char * globals, char * begin, declnlist d, set suppress, char * base );
extern void make_metadata( declnlist d );
/*
* set.c: set (based on hashes) storage for C.. with NHASH tweaked to be
* much smaller (197) than usual..
*
* (C) Duncan C. White, 1996-2017 although it seems longer:-)
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>
#include "set.h"
#define NHASH 197
typedef struct tree_s *tree;
struct set_s {
tree * data;
setprintfunc p;
int nmembers;
};
struct tree_s {
setkey k; /* key aka set member */
bool in; /* in, i.e. not deleted */
tree left; /* Left... */
tree right; /* ... and Right ptr's */
};
/*
* operation
*/
typedef enum { Search, Define, Exclude } ops;
/* Private functions */
static void adddelop( setkey k, void * v );
static void exclude_if_notin_cb( setkey k, void * arg );
static void diff_cb( setkey k, void * arg );
static void dump_foreachcb( setkey k, void * arg );
static tree talloc( setkey k );
static int shash( char * str );
static tree symop( set s, setkey k, ops op );
static void foreach_tree( tree t, setforeachcb f, void * arg );
static tree copy_tree( tree t );
static void free_tree( tree t );
static int depth_tree( tree t );
/*
* Create an empty set
*/
set setCreate( setprintfunc p )
{
set s = (set) malloc( sizeof(struct set_s) );
s->data = (tree *) malloc( NHASH*sizeof(tree) );
s->p = p;
s->nmembers = 0;
int i;
for( i = 0; i < NHASH; i++ )
{
s->data[i] = NULL;
}
return s;
}
/*
* Empty an existing set - ie. retain only the skeleton..
*/
void setEmpty( set s )
{
int i;
for( i = 0; i < NHASH; i++ )
{
free_tree( s->data[i] );
s->data[i] = NULL;
}
s->nmembers = 0;
}
/*
* Copy an existing set.
*/
set setCopy( set s )
{
int i;
set result;
result = (set) malloc( sizeof(struct set_s) );
result->data = (tree *) malloc( NHASH*sizeof(tree) );
result->p = s->p;
result->nmembers = s->nmembers;
for( i = 0; i < NHASH; i++ )
{
result->data[i] = copy_tree( s->data[i] );
}
return result;
}
/*
* Free the given set - clean it up and delete its skeleton too..
*/
void setFree( set s )
{
int i;
for( i = 0; i < NHASH; i++ )
{
free_tree( s->data[i] );
}
free( (void *) s->data );
free( (void *) s );
}
/*
* Set metrics:
* calculate the min, max and average depth of all non-empty trees
* sadly can't do this with a setForeach unless the depth is magically
* passed into the callback..
*/
void setMetrics( set s, int *min, int *max, double *avg )
{
int i;
int nonempty = 0;
int total = 0;
*min = 100000000;
*max = -100000000;
for( i = 0; i < NHASH; i++ ) {
if( s->data[i] != NULL )
{
int d = depth_tree( s->data[i] );
if( d < *min ) *min = d;
if( d > *max ) *max = d;
total += d;
nonempty++;
}
}
*avg = ((double)total)/(double)nonempty;
}
/*
* Add k to the set s
*/
void setAdd( set s, setkey k )
{
(void) symop( s, k, Define);
}
/*
* Remove k from the set s
*/
void setRemove( set s, setkey k )
{
(void) symop( s, k, Exclude);
}
/*
* Convenience function:
* Given a changes string of the form "[+-]item[+-]item[+-]item..."
* modify the given set s, including (+) or excluding (-) items
* NB: This assumes that key == char *..
*/
void setModify( set s, setkey changes )
{
char *str = strdup( changes ); /* so we can modify it! */
char *p = str;
char cmd = *p;
while( cmd != '\0' ) /* while not finished */
{
assert( cmd == '+' || cmd == '-' );
p++;
/* got a string of the form... [+-]itemstring[+-\0]... */
/* cmd = the + or - command */
/* and p points at the first char ^p */
/* find the next +- command, ^q */
char *q = p;
for( ; *q != '\0' && *q != '+' && *q != '-'; q++ );
/* terminate itemstring here, remembering the next cmd */
char nextcmd = *q;
*q = '\0';
/* now actually include/exclude the item from the set */
if( cmd == '+' )
{
setAdd( s, p );
} else
{
setRemove( s, p );
}
/* set up for next time */
cmd = nextcmd; /* the next command */
p = q; /* the next item */
}
free( (void *)str );
}
/*
* Look for something in the set s
*/
int setIn( set s, setkey k )
{
tree x = symop(s, k, Search);
return x != NULL && x->in;
}
/*
* perform a foreach operation over a given set