CADE structure diagrams

I'm going to be using these diagrams quite a bit to illustrate how the elements of an HTML file fit together. They come from a system called CADE (Computer-Aided Document Engineering) made by Microstar, implemented in their Near&Far program. The `family-tree' structure is used to indicate which elements belong inside which other elements, and thus show where they can be used.

The `root' element in our case is always <html>, signified by the black rectangle to its left. The connector bracketing shows what each element can be made up of: square bracketing means that the order of the elements is significant; curved bracketing means it is not; angled straight-line bracketing simply means you can choose arbitrarily from among the allowed elements, as your needs determine.
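
If it helps to think of the three kinds of bracketing as rules, here is a minimal Python sketch of them. The connector names ("sequence", "any-order", "choice") and the function `matches` are my own labels for illustration, not CADE or Near&Far terminology, and the sketch assumes every listed element occurs exactly once (occurrence indicators are ignored).

```python
from typing import Sequence

def matches(connector: str, allowed: Sequence[str], children: Sequence[str]) -> bool:
    """Check a list of child element names against one bracketed group.

    Simplifying assumption: each allowed element occurs exactly once,
    except under "choice", where any mix of allowed elements is accepted.
    """
    if connector == "sequence":      # square bracketing: order is significant
        return list(children) == list(allowed)
    if connector == "any-order":     # curved bracketing: order is not significant
        return sorted(children) == sorted(allowed)
    if connector == "choice":        # angled bracketing: pick from the allowed elements
        return all(child in allowed for child in children)
    raise ValueError(f"unknown connector: {connector}")
```

For example, `matches("sequence", ["head", "body"], ["body", "head"])` is false because the order is wrong, whereas the same call with "any-order" is true.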

Elements are preceded by a symbol which indicates how many times an element can be used (occurrence indicators):

Compulsory
- An open square means the element is compulsory, but may occur only once;
Optional
- A filled square means the element is optional, but if present, it may only occur once;
Compulsory but repeatable
- An overlapped filled square means the element is compulsory but may occur more than once;
Optional and repeatable
- An overlapped open square means the element is optional, but may occur any number of times.
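
The four occurrence indicators above boil down to a minimum and a maximum number of times an element may appear. The following Python sketch expresses that; the names `Occurrence`, `OCCURRENCE` and `allowed` are invented for this illustration and are not part of CADE.

```python
from typing import NamedTuple, Optional

class Occurrence(NamedTuple):
    minimum: int            # fewest times the element must appear
    maximum: Optional[int]  # most times it may appear; None means no upper limit

# The four occurrence indicators, as (minimum, maximum) bounds.
OCCURRENCE = {
    "compulsory": Occurrence(1, 1),
    "optional": Occurrence(0, 1),
    "compulsory but repeatable": Occurrence(1, None),
    "optional and repeatable": Occurrence(0, None),
}

def allowed(indicator: str, count: int) -> bool:
    """Return True if an element may occur `count` times under `indicator`."""
    occ = OCCURRENCE[indicator]
    if count < occ.minimum:
        return False
    return occ.maximum is None or count <= occ.maximum
```

So `allowed("optional", 0)` is true, while `allowed("compulsory", 2)` is false.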

Elements with no symbol are compulsory (first group above). Ultimately, most elements end up containing text. CADE diagrams use five symbols to show what content an element can have:

PCDATA
- Parsed character data: text which may include further element markup and character entities;
RCDATA
- Replaceable character data: text which may contain character entities but not element markup;
CDATA
- Character data: plain text which may not contain any markup or character entities;
ANY
- Any text: PCDATA or further occurrences of elements;
EMPTY
- Empty: the element may not have any content.

An element name followed by a tilde (~) means that the element can have `attributes' which modify its meaning. Some more complex constructions, such as the definitions in HTML3, use Inclusions and Exclusions (each marked in the diagrams by its own symbol) to prevent some elements containing unwanted material, such as paragraph matter within mathematics. SGML also has a shorthand way of referring to a group of elements, called a `parameter entity': these are shown in the CADE diagrams preceded by a percent sign. This avoids having to display the full content definition of a frequently-used element (like a paragraph) every time it gets referred to.
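
To make the shorthand idea concrete, here is a small Python sketch of how a parameter entity stands in for a group of elements. The entity name `%emphasis` and its members are hypothetical, chosen only for illustration; they are not taken from any particular HTML definition.

```python
from typing import Dict, List

# Hypothetical parameter entity: the name and its members are invented for illustration.
PARAMETER_ENTITIES: Dict[str, List[str]] = {
    "%emphasis": ["em", "strong"],
}

def expand(names: List[str]) -> List[str]:
    """Replace each %name with the elements it stands for.

    The function recurses, so one parameter entity may refer to another.
    """
    result: List[str] = []
    for name in names:
        if name.startswith("%"):
            result.extend(expand(PARAMETER_ENTITIES[name]))
        else:
            result.append(name)
    return result
```

Calling `expand(["%emphasis", "code"])` gives the full list `["em", "strong", "code"]`, which is what a diagram would otherwise have to spell out each time.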