Scratchy Labs:AustenX

Austen manual

Introduction

Skeleton is an ongoing project to produce a tool for building code-generating tools: it is a code-generator code generator. To me, it is not exactly a template engine, but rather a declarative, language-based approach that is aware of the target language. That is, it is a sort of programming language that currently produces source code which, in turn, generates the end-point source code. Currently, it produces Java code that produces more Java code. I believe there are some interesting ideas behind the approach. After much work it is still very much a work in progress, but much progress has been made in working out how it should roughly work in practice (when I started I really had no idea).

I apologise in advance for any confusion that may arise from what exactly is generating code for what, especially when one version of Skeleton is generating code for another version (which is generating code for the first version). If you get confused please take a moment to appreciate the mental anguish that I have had to endure in writing this!

Version history

The current version of Skeleton is actually two separate versions of the tool that work in subtly, but significantly, different ways. Skeleton has progressed through multiple versions.

The first I call "OldSkeleton" (though obviously, this was not always the case). This version produced its code through hand-written code generating code. It had a number of shortcomings and dead ends in its approach. It was hard to maintain. The OldSkeleton description language was heavily Java directed.

The second version of Skeleton, I call Skeleton. This version had the code-generating part constructed by OldSkeleton, but was sufficiently different from OldSkeleton that it was not straightforward to get it to generate its code-generating code by itself. Skeleton moved toward abstracting away the target code generation from the user of the generated code (told you this gets confusing), by switching to a two part description - one part describing the form of the generated documents, and the other part being specific implementations for individual languages. Only Java was implemented.

For the third version, which is rather a different branch of development, I thought it would be easier to reimplement OldSkeleton using the Skeleton generator, and thus create two co-dependent generators. This is what I embarked on for OldSkeletonX, but the project became more in depth, and I refined some of the concepts of Skeleton and OldSkeleton, so while OldSkeletonX is based on the inferior OldSkeleton, it in some ways has novel features lacking in Skeleton.

The last significant version is SkeletonX - the current implementation of Skeleton, which uses OldSkeletonX to create its code-generation code. This version also has some shortcomings (because this project has primarily, for me, been an exercise in creating something for which I did not really know how it should work, when I started).

There is one more possible version one might consider, which is the version of OldSkeletonX that uses SkeletonX to generate its code-generation code, but it works essentially as the earlier version (with a few fewer bugs).

The remainder of this document deals with both SkeletonX, and OldSkeletonX, explaining the shared philosophy, the similarities, and the differences between the two versions.

The SkeletonX Design

For this section we will focus on SkeletonX, which has a specific design component. In SkeletonX input there must be somewhere a design specification. For example:

design Example {
	House(String name) {
		Door(String colour, House.Room exitRoom) {
		
		}
		Room(String roomName, int numberOfWindows) {
			Furniture() {
				Sofa; Chair; Table;
			}
		}
		Roof(double width)?;
	}
}

The above example illustrates most of the significant concepts of a design. A design first has a name (in this case "Example"), and then it has a description of the elements that make up the data of the model. In this case there is one root node type - the "House" type, which has a single parameter - a string specifying the name (of the house). A House node has a number of possible child nodes. The three types the children can have are "Door", "Room", and "Roof".

In general, there can be zero or more of a particular type of child. If the "?" symbol is used, as in the example of the "Roof" child type, there can either be zero or one (there is some support for a fixed child but that is inconsistent and seldom used in the current version).

A child node can itself have a number of children (like the "Room" type in the example). Each child may have one or more parameters. If there are no parameters the "()" brackets can be left off (the semi-colon is probably optional - if it isn't it will be at some point).

Parameters can be either "primitives" ((s/S)trings*, doubles, and ints) or references to other nodes. So, for example, "Door" child nodes (of "House" nodes) have a "colour" parameter, of type String. Door nodes also have an "exitRoom" parameter, which is a reference to a Room node (declared as "House.Room"). If you want a list of references to nodes, you need to create a child, with a single reference as a parameter (then the multiple children become the list). Future versions may include array parameters of references.

* as a long-time Java programmer I tend to write "String", even though in things I develop String is usually a primitive type - "string" is always taken as equivalent.
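
To make the shape of such a model concrete, here is a minimal sketch in plain Java of data that would fit the Example design. The class names, fields, and sample values are assumptions made for illustration only; they simply mirror the design and are not the classes SkeletonX actually generates or expects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written mirror of the "Example" design; not SkeletonX output.
final class ExampleModelSketch {
    static final class Room {
        final String roomName;
        final int numberOfWindows;
        // Furniture children are left out to keep the sketch short.
        Room(String roomName, int numberOfWindows) {
            this.roomName = roomName;
            this.numberOfWindows = numberOfWindows;
        }
    }

    static final class Door {
        final String colour;
        final Room exitRoom; // reference parameter, declared as House.Room in the design
        Door(String colour, Room exitRoom) {
            this.colour = colour;
            this.exitRoom = exitRoom;
        }
    }

    static final class House {
        final String name;
        final List<Door> doors = new ArrayList<>(); // zero or more Door children
        final List<Room> rooms = new ArrayList<>(); // zero or more Room children
        Double roofWidth = null;                    // "Roof(double width)?": null means no Roof child
        House(String name) {
            this.name = name;
        }
    }

    // Builds one small model instance with made-up values.
    static House sample() {
        House house = new House("One");
        Room hall = new Room("Hall", 2);
        house.rooms.add(hall);
        house.doors.add(new Door("Red", hall));
        return house;
    }

    public static void main(String[] args) {
        House house = sample();
        System.out.println(house.name + ": " + house.doors.size() + " door(s), "
                + house.rooms.size() + " room(s), roof present: " + (house.roofWidth != null));
    }
}
```

Note how the "exitRoom" reference points at a Room object that is also a child of the same House, which is the situation the "House.Room" parameter type describes.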

There are additional parts to the design, such as templates, but I will leave them for another time.

The SkeletonX Java implementation

Given a model to a specific design, how does SkeletonX produce code? The implementation of the specification caters for this. This side of things is very similar to what OldSkeletonX uses, except the model is more integrated in the case of OldSkeletonX (this will be described later). The concepts are similar.

An example implementation might look like this:

@java ExampleJava implements Example {
	public class Example {
		public void moo() {
		}
	}
}

The above example is a Java based implementation for "Example" modeled data. Note the use of "@java". With the implementation side of things some keywords have "@" at the start to identify them as "meta" keywords. The exact use of this may change over time (there isn't really a need for it in terms of parsing, but in the absence of syntax-colouring, it helps highlight SkeletonX keywords, over Java ones). This implementation does not do much except generate code that generates a class called "Example", with only one method "moo".

Let's take this slow

So before we go on, I think we should go over the above example slowly. Given the above input (and some other boilerplate bits), SkeletonX would produce a bunch of Java code, none of which includes a class called "Example". Rather, the code generated would, when used by the user tool, generate a file called "Example.java", with the specified class.
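
The two stages can be sketched in plain Java. The class below is a hypothetical stand-in for the kind of code SkeletonX emits (its name and method are invented here, and the real generated code will look different); the point is only that this class is not "Example", it is code that writes out the source of "Example".

```java
// Hypothetical stand-in for generator code produced by SkeletonX; names are illustrative only.
final class ExampleGeneratorSketch {
    // Returns the text that would end up in Example.java.
    static String generateExampleSource() {
        StringBuilder out = new StringBuilder();
        out.append("public class Example {\n");
        out.append("\tpublic void moo() {\n");
        out.append("\t}\n");
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // The user tool would write this text to a file; here it is just printed.
        System.out.print(generateExampleSource());
    }
}
```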

Back to things

This particular implementation is not very productive because it pays no attention to the data-model provided (how this is provided is explained soon). Regardless of the data provided, the output is always one class, with one method. What needs to be done is the data model needs to be linked in with the output.

Linking

Quite why this is called linking I'm not too sure, but the process revolves around a keyword "link" (actually "@link", so we might as well keep with that). The way Skeleton works is to mix in model information with the source code. This is done by "link" blocks, that are in fact linked in specific ways. It can be best explained with an example:

@java ExampleJava implements Example {
	public class Example {
		@link House {
			public void moo() {
			}
		}
	}
}

In the above example (using the data model previously described), the code generated could be used to generate a class called "Example" that has a "moo" method for each "House" child of the provided model. Of course, there can be only one method (with a no-parameter signature) of a given name in a class, in Java, so in this case, if, say, there were two "House" children in the model, there would be one method named "moo", and one named "moo_1".
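
The default naming can be pictured with a small sketch. This is an assumption about the behaviour, not the actual SkeletonX code: the text above only states that the second method becomes "moo_1", so the continuation to "moo_2" and the helper class itself are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical allocator that hands out "moo", "moo_1", "moo_2", ... for a repeated base name.
final class UniqueNamesSketch {
    private final Map<String, Integer> timesUsed = new HashMap<>();

    String allocate(String base) {
        Integer count = timesUsed.get(base);
        if (count == null) {
            timesUsed.put(base, 1);
            return base; // first use keeps the plain name
        }
        timesUsed.put(base, count + 1);
        return base + "_" + count; // later uses get a numeric suffix
    }

    public static void main(String[] args) {
        UniqueNamesSketch names = new UniqueNamesSketch();
        System.out.println(names.allocate("moo"));
        System.out.println(names.allocate("moo"));
    }
}
```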

The way that methods are named when there is more than one child is not particularly useful, so a more refined approach is provided. Consider the following adaptation:

@java ExampleJava implements Example {
	public class Example {
		@link House {
			public void moo[$name,Moo]() {
			}
		}
	}
}

The bit in the square brackets allows specific details about the method name to be given, for each child. In this case the name is formed first by referencing the "name" variable of the "House" child (see the previous model definition), followed by the constant "Moo". So if my model had a "House" child with name="One", and a child with name="Two", then the following class would be generated (using the code generated by SkeletonX, not in the code generated by SkeletonX):

public class Example {
	public void oneMoo() { }
	public void twoMoo() { }
}

Notice how the names are lower case at the start. SkeletonX (or Bones, the raw code generator), roughly reconfigures identifiers to be consistent with general Java norms (eg, methods are lower-case starting camelCaps). It does not always get this right though (for future revision).
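
As a rough illustration of that renaming, the sketch below joins the naming parts and forces a lower-case first letter. It is an assumption about the simplest rule that reproduces "oneMoo" and "twoMoo"; the helper name is invented, and the real logic in SkeletonX (or Bones) is likely more involved.

```java
// Hypothetical helper: joins naming parts into a lower-case-starting camel-case method name.
final class NamePartsSketch {
    static String methodName(String... parts) {
        StringBuilder name = new StringBuilder();
        for (String part : parts) {
            if (part.isEmpty()) {
                continue; // nothing to add for an empty part
            }
            char first = part.charAt(0);
            // The first part starts lower-case; every later part starts upper-case.
            name.append(name.length() == 0 ? Character.toLowerCase(first) : Character.toUpperCase(first));
            name.append(part.substring(1));
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(methodName("One", "Moo"));
        System.out.println(methodName("Two", "Moo"));
    }
}
```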

Sharing across links

The important thing about linking, and why this is more than just a template-engine thing, is the way that multiple "link" blocks, are visible to each other. For example, the following does not work:

@java ExampleJava implements Example {
	public class Example {
		@link House {
			public void moo[$name,Moo]() {
			}
		}
		public void mooTest() {
			moo();
		}
	}
}

This produces a compile error (from SkeletonX), as in the method "mooTest", the method "moo" is not visible (because it is defined in the "House" link block). (I know you are thinking - is it not just because I'm saying "moo()", and not "oneMoo()", or "$name,Moo()", but I'm sort of skipping over things so you'll pick it up as we go).

The correct way to do the above (or at least, a way that compiles) is the following:

@java ExampleJava implements Example {
	public class Example {
		@link House {
			public void moo[$name,Moo]() {
			}
		}

		public void mooTest() {
			@link House {
				moo();
			}
		}
	}
}

In the above example, there are two link blocks, both for "House". One is around a method, and one is around a statement within a method. Link blocks are also valid around classes themselves (and in a few other places, as will be seen). Because the call to "moo()" is within the "House" linkage the compiler knows which "moo" method you are talking about. One does not refer to the method by the naming instructions provided (the bit in the square brackets), just by the base name. The base name is what is used by SkeletonX, the naming bits are just what is used in the final generated code.

Again, with the two "House" children with "name"s "One" and "Two" respectively, given the above SkeletonX definition, and using the code generator generated by that definition (so again, not the code generated by SkeletonX, but through the user code using that generator, with the given model) we would get the following generated:

public class Example {
	public void oneMoo() { }
	public void twoMoo() { }
	public void mooTest() {
		oneMoo();
		twoMoo();
	}
}

Now, hopefully, one is getting a feel for the point of SkeletonX. The link between the generated method names is automatically created. This is not just a simple template engine. Of course, it just gets more complex from here.
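
To show what the two "House" link blocks amount to, here is a hand-written sketch of a generator that loops over the houses twice, once for the declarations and once for the calls, using the same naming rule both times. Everything in it (class name, methods, the simplified naming rule) is an assumption for illustration; the code SkeletonX really produces is not shown in this manual.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical generator: both loops share mooName(), so declarations and calls stay in step.
final class LinkedNamesSketch {
    static String mooName(String houseName) {
        return Character.toLowerCase(houseName.charAt(0)) + houseName.substring(1) + "Moo";
    }

    static String generate(List<String> houseNames) {
        StringBuilder out = new StringBuilder("public class Example {\n");
        for (String house : houseNames) { // corresponds to the first "@link House" block
            out.append("\tpublic void ").append(mooName(house)).append("() { }\n");
        }
        out.append("\tpublic void mooTest() {\n");
        for (String house : houseNames) { // corresponds to the second "@link House" block
            out.append("\t\t").append(mooName(house)).append("();\n");
        }
        out.append("\t}\n}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate(Arrays.asList("One", "Two")));
    }
}
```

The saving is that the author of the SkeletonX definition never writes either loop or the shared naming helper; the link blocks imply them.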

Nested linkages

Now we have a link block around the first child of a model (in the model we'll be using, defined above, that is the "House" children), it is now time to look at linkages around the children of a model node. In our case there are three types of children nodes of a "House" node: "Door", "Room", and "Roof". We could create a link block around each of those if we wished. Consider:

@java ExampleJava implements Example {
	public class Example {
		@link House {
			@link Door {
				public void doorMethod[$colour,DoorFor,$name,House]() {
				}
			}
		}

		public void mooTest() {
			@link House.Door {
				doorMethod();
			}
		}
	}
}

In the above example, once in a "House" link block one can then link with one of the children node types (in this case "Door" children). The final name of each method is defined first by reference to the "colour" variable of the Door child, the constant "DoorFor", then the name of the parent "House" node, then the constant "House". Note the syntax of the link block in the "mooTest()" method. The "@link House.Door" is just short hand to link with "Door" children in "House" children (it is equivalent to just nesting the two blocks as used previously).

One can also link multiple children:

@java ExampleJava implements Example {
	public class Example {
		@link House {
			@link Door {
				public void doorMethod[$colour,DoorFor,$name,House]() { }
			}
			@link Room {
				private int windowCount_[$roomName, WindowCount_] = $numberOfWindows;
			}
		}

		public void mooTest() {
			@link House.Door {
				doorMethod();
			}
			int totalNumberOfWindows = 0;
			@link House.Room {
				totalNumberOfWindows += this.windowCount_;
			}
		}
	}
}

The above example does linkages around both the "Door" and "Room" children of a "House", and introduces a few more features of SkeletonX, being just more Java usage. In this case, the link block around "Room" children creates instance variables and not instance methods, but these are still accessible in different places. Note that in this case SkeletonX does not count the number of windows, nor does the code generated by SkeletonX. It is the code generated by the user tool (using code generated by SkeletonX) that would add up the windows (sorry for over stressing this, but it can be confusing).

Places one can link

Just a note on where one can link things. First, one can link at the "root", around classes (so a link block can be outside a class, but contain classes - or interfaces). Second, a link block can be, as seen, within classes (or interfaces), and can contain methods, instance variables, and sub classes/interfaces. Note that the namespace of types is limited in general - if one has a sub class named "Foo" in one class, another class named "Foo" in another class, yet within the same linkage scope, will probably conflict, and cause an error (so even though in Java land the classes are outside each other's scope, in (Old)SkeletonX it's the linkage scope that matters more - this will change if it hasn't already). Method names and instance variables do not have this issue.

Linkages can occur within a method, so around segments of code. Local variables created within a linkage block in a section of code can be seen by matching blocks later in the code. Linkage blocks can occur around "extends", "implements", and "throws" declarations. Linkage blocks can also be in parameter definitions, as well as in parameter value passing.

An example, not using the previously defined model, that uses all possible linkage types, is as follows:

@java ExampleJava implements RandomModel {
	@link A {
		public class AClass[$name, AClass] {
			@link B {
				public void moo(int initialValue, @link C value, int lastBit) {
					int x = initialValue;
					@link C {
						x+=value;
					}
					x+=lastBit;
				}
			}
		}
		public interface Foo[$name, Foo] { }
	}
	public interface Goo { }
	public class MainClass implements @link A { Foo }, Goo {
		@link A {
			public AClass thingy_;
		}
		public void baa(@link A AClass thingBase) {
			@link A {
				this.thingy_ = thingBase;
				@link B {
					this.thingy_.moo(20, @link C $defaultValue, 50);
				}
			}
			
		}
	}
}

Crossref linkages

The next form of linkage, the "cross reference" linkage, is more complex, but significant in the usage of SkeletonX. This gets tricky because it is here that the first significant differences between OldSkeletonX and SkeletonX show up. It is also where the current limits of the language become apparent. It's not that bad though, and OldSkeletonX is on the right track, I feel, to getting this correct.

The cross reference comes from the fact that children contain references to other children. In the example model we see that "Door" children (of "House" nodes) have a reference to an "exitRoom", which is of type "Room" child of "House" - written as "House.Room" (the full "path" must be specified - "Room" will not currently suffice, even though "Room" is a co-child-type of "Door").

		Door(String colour, House.Room exitRoom) { }

The crossref linkage is one across not a child but a cross reference. So, in the example we could link in regard to the "exitRoom" reference. Things are different between OldSkeletonX and SkeletonX in that OldSkeletonX requires that one link via the cross reference, but still over a child of that referenced node, whereas SkeletonX allows one to link via the cross reference alone. What this means will become apparent soon, but strange as it seems, I think the OldSkeletonX way is the way to go. I have chosen to take on some verbosity in the language to avoid confusion (things get more verbose in OldSkeletonX as we complicate things).

In general, referencing of Skeleton variables, and not the target language (Java) variables, is marked via the use of "$". So to refer to the "exitRoom" variable, we'd write "$exitRoom". To link over a child of a variable we'd write "@link $variable.Child". So, to link over the Furniture children of the "exitRoom" reference, the following could be used:

@java ExampleJava implements Example {
	public class Example {
		@link House {
			@link Door {
				public void doorMethod[$colour,DoorFor,$name,House]() { }
			}
			@link Room {
				@link Furniture {
					private int furnVariable_ = 0;
				}
			}
		}

		public void mooTest() {
			@link House.Door {
				@link $exitRoom.Furniture {
					this.furnVariable_ = 5;
				}
			}
		}
	}
}
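
In terms of the data being walked, a link over "$exitRoom.Furniture" inside a "House.Door" link follows each door's reference and then visits the Furniture children of the room it points to. The sketch below shows only that traversal, in plain Java, for the doors of a single house. The classes are hypothetical, and furniture entries are represented as plain labels, which is a simplification of the design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model classes; only the traversal order is the point of this sketch.
final class CrossRefSketch {
    static final class Room {
        final String roomName;
        final List<String> furniture; // simplified: one label per Furniture child
        Room(String roomName, List<String> furniture) {
            this.roomName = roomName;
            this.furniture = furniture;
        }
    }

    static final class Door {
        final String colour;
        final Room exitRoom; // the cross reference
        Door(String colour, Room exitRoom) {
            this.colour = colour;
            this.exitRoom = exitRoom;
        }
    }

    // Outer loop: the Door children. Inner loop: Furniture reached through door.exitRoom.
    static List<String> visit(List<Door> doors) {
        List<String> visited = new ArrayList<>();
        for (Door door : doors) {
            for (String item : door.exitRoom.furniture) {
                visited.add(door.colour + " door -> " + door.exitRoom.roomName + " -> " + item);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Room hall = new Room("Hall", Arrays.asList("Sofa", "Chair"));
        Room kitchen = new Room("Kitchen", Arrays.asList("Table"));
        List<Door> doors = Arrays.asList(new Door("Red", hall), new Door("Blue", kitchen));
        for (String line : visit(doors)) {
            System.out.println(line);
        }
    }
}
```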

Scope across references

Scope of items is the major sticking point for SkeletonX, and is where it all falls apart. Exactly how to manage scope is not something I feel I have exactly worked out in an acceptable way. The OldSkeletonX approach seems better to me, but is more verbose. Part of the problem is also the mixing of the scope of the target language and the scope of SkeletonX. The first thing that needs explaining is what I call scope layers.

Layers

The link blocks form "layers" in SkeletonX, which manage scope (mostly). The layers are determined by the link blocks, and in particular, the order in which things are linked. If we have a link block around a direct child (eg, @link House) this creates another layer (a "House" layer), that is below the layer in which the @link occurs. So, a layer structure might look like:

root

House

Door

In this example, there are three layers, where the root layer is the initial layer before a link block, and each layer represents a link block. In this case, in the code I refer to these as "Foundation" layers, as they are around link blocks matching the data model exactly (with no cross referencing, or "back refs" - not yet explained).

Note that the above is just one set of layers - there are others, which simply match, for Foundation layers, the paths from the root to each node type in the data model (I hope that makes sense).

What matters is that any target language stuff in a layer is visible to any other target language stuff in the same layer, given the normal target language scope rules. Also, any target language stuff in a higher layer is visible to any other target language stuff in the lower layer, given the normal target language scope rules.

So, if a method is defined in the "House" layer, then it is visible to a segment of code in the House layer (even if in a different link block), as long as that method would be visible in the target language. If a variable is defined in one block of target code, and is referenced in a different block of code that is out of target language scope, but in the same layer, it won't be visible and an error will occur. Eg:

	public void scopes() {
		if("cat"=="dog") {
			@link House {
				int x =4;
			}
		} else {
			@link House {
				x = 5;
			}
		}
		@link House {
			System.out.println("X is:"+x);
		}
	}

The above example will flag two errors: the first being the "x=5", because "x" is not defined in the same target scope, and same for "System.out.print...", even though both are in the same layer as the "int x=4;" declaration.
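
The layer half of the visibility rule described above can be sketched as a chain of scopes, where a lookup starts in the current layer and walks up toward the root. This is a hypothetical model for illustration, not the SkeletonX implementation, and it deliberately ignores the second half of the rule (the normal target language scope rules, which are what cause the two errors in the example).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of layers: a name is visible in its own layer and in every layer below it.
final class LayerSketch {
    final String name;
    final LayerSketch parent; // the layer above; null for the root layer
    private final Set<String> declared = new HashSet<>();

    LayerSketch(String name, LayerSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    void declare(String identifier) {
        declared.add(identifier);
    }

    // Looks in this layer first, then in each higher layer up to the root.
    boolean canSee(String identifier) {
        for (LayerSketch layer = this; layer != null; layer = layer.parent) {
            if (layer.declared.contains(identifier)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LayerSketch root = new LayerSketch("root", null);
        LayerSketch house = new LayerSketch("House", root);
        LayerSketch door = new LayerSketch("Door", house);
        house.declare("moo");
        System.out.println("Door sees moo: " + door.canSee("moo"));
        System.out.println("root sees moo: " + root.canSee("moo"));
    }
}
```

This matches the earlier "mooTest" example: code at the root layer could not see "moo", while code inside a "House" link block could.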

Complex layers

Where things get dicey is when cross reference layers are used. In SkeletonX there are two types of layers - "node" layers, and "variable" layers. OldSkeletonX dumps this distinction (which means more typing, but makes more sense), but there are still cross referenced layers. Sticking with SkeletonX, if we used the link command "@link House.Door.$exitRoom.Furniture { ...", we would end up with a layer structure like the following:

root

House

Door

$exitRoom

Furniture

TO BE CONTINUED...