Hi,
after receiving some encouragement from Andrew to keep going on the
protocol description language project I am working on for generating
Ethereal decode modules for the SMB protocol and other things, I thought I
would put down some thoughts and ask for ideas.
Currently I have a number of problems. Some of these are parser problems
that can be fixed by building a cleaner parser. That is, they are simply a
matter of programming.
However, a big problem I currently have is with the language itself.
An example might show the problems. Here is an example SMB description:
    SMB funny-smb {       # An SMB is a clause
      andx;               # Which can contain an andx marker that triggers the
                          # parser to generate code to handle ANDXs
      request {           # There can be one or more requests
        UCHAR Word Count (WCT) = 2;
        USHORT Some Funny Field;
        BITFIELD 16 A Funny Field = {
          0x01 = { "This value not set" , "This value set" };
        };
        USHORT Byte Count (BCC);
        STRING A String Field;
      }
      response {          # There can be one or more responses ...
        # etc ...
      }
    }
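To make the example concrete, here is a rough Python sketch of the decoding work that generated code would have to do for the request above. It is not what my parser emits. The function name, the dictionary keys, the little-endian layout, the use of BCC as the length of the string area and the NUL-terminated ASCII string are all assumptions made for illustration.

```python
import struct

def decode_funny_request(buf):
    """Hand-written stand-in for a generated decoder (hypothetical)."""
    off = 0
    (wct,) = struct.unpack_from("<B", buf, off)        # UCHAR Word Count (WCT)
    off += 1
    if wct != 2:                                       # the description says WCT = 2
        raise ValueError("unexpected word count %d" % wct)
    some_field, flags = struct.unpack_from("<HH", buf, off)  # USHORT + BITFIELD 16
    off += 4
    (bcc,) = struct.unpack_from("<H", buf, off)        # USHORT Byte Count (BCC)
    off += 2
    raw = buf[off:off + bcc]                           # assumed: BCC bytes follow
    text = raw.split(b"\x00", 1)[0].decode("ascii")    # assumed: NUL-terminated ASCII
    return {
        "wct": wct,
        "some_funny_field": some_field,
        # Bit 0x01 selects one of the two strings from the BITFIELD clause.
        "a_funny_field": "This value set" if flags & 0x01 else "This value not set",
        "bcc": bcc,
        "a_string_field": text,
    }

# A made-up packet: WCT=2, two 16-bit words, BCC=6, then "hello\0".
sample = struct.pack("<BHHH", 2, 0x1234, 0x0001, 6) + b"hello\x00"
print(decode_funny_request(sample))
```

Every line of that function is implied by one line of the description, which is the point of generating it rather than writing it by hand.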
Now, the first big problem that I see is that the language and parser are
tied to generating Ethereal decode modules. I would like something a
little more general and am looking for ideas on this.
I think I need to separate declaration from actions.
Something that comes to mind is that each statement could have a
declaration part and an action part. When combined with:
    Include files
    Functions?
    Type definitions
I could have a language that generates Ethereal decode modules as well as
one that generates code to do a differential comparison of Samba with, say,
NT4.0 SP6 or NT 2000?
A master include file could specify the code generated by each action?
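As a rough illustration of separating declarations from actions, the Python sketch below keeps the field declarations as plain data and puts the per-action code generation into interchangeable tables, which play the role of the master include file. All names here are hypothetical (gen-string in particular is a made-up action), and the emitted lines are placeholder pseudo-code rather than real Ethereal or Samba calls.

```python
# Declarations only: (type, label, action name). Nothing here knows about
# Ethereal or about any other backend.
DECLS = [
    ("USHORT", "Some Funny Field", "gen-ushort"),
    ("STRING", "A String Field", "gen-string"),   # gen-string is a made-up action
]

# Two stand-ins for "master include files". Each maps an action name to a
# function returning the text to emit for one field.
DECODE_ACTIONS = {
    "gen-ushort": lambda label: 'display 2 bytes as "%s"' % label,
    "gen-string": lambda label: 'display string as "%s"' % label,
}
DIFF_ACTIONS = {
    "gen-ushort": lambda label: 'compare 2 bytes of "%s" in both replies' % label,
    "gen-string": lambda label: 'compare string "%s" in both replies' % label,
}

def generate(decls, actions):
    """Run every declaration's action through the chosen action table."""
    return [actions[action](label) for _ftype, label, action in decls]

for line in generate(DECLS, DECODE_ACTIONS):
    print(line)
for line in generate(DECLS, DIFF_ACTIONS):
    print(line)
```

The same DECLS list drives both outputs; only the action table changes, which is roughly what I would want from swapping one master include file for another.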
So, the syntax might look like:
    proto SMB = {
      unit funny-smb {
        UCHAR Word Count (WCT) [wct] %gen-uchar();
          # Which means there is a field of type UCHAR and that the action
          # gen-uchar should be called with the parse tree element for this
          # item to generate code to handle the field.
        UCHAR AndXCommand [andxc];
          # Declare this item for later use, i.e., name the node in the parse tree
        UCHAR AndXReserved;        # Ignore this field
        USHORT AndXOffset [andxoff];
        request {
          %gen-if(wct, "=", 4);    # Generate code to compare wct to 4
            USHORT Field 1 %gen-ushort();
            USHORT Field 2 %gen-ushort();
            etc ...
          %end-gen-if();           # End of code to display one format
          %gen-if(wct, "=", 3);
            USHORT Field 1 %gen-ushort();
            USHORT Field 2 %gen-ushort();
            etc ...
          %end-gen-if();
          etc ...
        };
        response {
          USHORT Count [cnt] %gen-ushort(cnt);
          ... some fields
          REPEAT cnt of {          # Indicates repeats based on cnt
            USHORT Resp 1 %gen-ushort();
            USHORT Resp 2 %gen-ushort();
          };
        };
        %gen-andx(andxc);          # Generate code to do the andX
      }
    };
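To show how named nodes, the wct test and REPEAT might hang together, here is a small Python sketch that walks a hand-built parse tree and decodes bytes directly. It interprets the tree instead of generating code, purely to keep the example short. The node shapes, the SIZES table, the walk function and the little-endian layout are all assumptions, not part of the current parser.

```python
import struct

SIZES = {"UCHAR": "<B", "USHORT": "<H"}   # assumed little-endian layouts

# Hypothetical node shapes:
#   ("field", type, label, node name or None)
#   ("if", node name, expected value, [children])
#   ("repeat", node name, [children])
RESPONSE = [
    ("field", "USHORT", "Count", "cnt"),
    ("repeat", "cnt", [
        ("field", "USHORT", "Resp 1", None),
        ("field", "USHORT", "Resp 2", None),
    ]),
]
REQUEST = [
    ("field", "UCHAR", "Word Count (WCT)", "wct"),
    ("if", "wct", 2, [
        ("field", "USHORT", "Field 1", None),
        ("field", "USHORT", "Field 2", None),
    ]),
]

def walk(nodes, buf, off, names, out):
    """Decode buf according to nodes; return the offset after the last field."""
    for node in nodes:
        kind = node[0]
        if kind == "field":
            _, ftype, label, name = node
            fmt = SIZES[ftype]
            (value,) = struct.unpack_from(fmt, buf, off)
            off += struct.calcsize(fmt)
            out.append((label, value))
            if name is not None:          # remember named nodes for later use
                names[name] = value
        elif kind == "if":
            _, name, expected, children = node
            if names[name] == expected:
                off = walk(children, buf, off, names, out)
        elif kind == "repeat":
            _, name, children = node
            for _ in range(names[name]):
                off = walk(children, buf, off, names, out)
    return off

resp_fields = []
walk(RESPONSE, struct.pack("<HHHHH", 2, 10, 11, 20, 21), 0, {}, resp_fields)
print(resp_fields)

req_fields = []
walk(REQUEST, struct.pack("<BHH", 2, 7, 8), 0, {}, req_fields)
print(req_fields)
```

A code generator would make the same walk over the tree but emit source text at each node instead of reading bytes.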
There are some other ideas lurking around ... What do you folks think?
Regards
-------
Richard Sharpe, sharpe@xxxxxxxxxx, NS Computer Software and Services P/L,
Samba (Team member www.samba.org), Ethereal (Team member www.zing.org)
Co-author, SAMS Teach Yourself Samba in 24 Hours