Silver's Simple Site - Weblog - Simis Jinx 3rd Level File Formats


Simis Jinx 3rd Level File Formats

The Simis file format with the 2nd-level Unicode text and binary Jinx formats are a pretty generic set of formats; they contain an arbitrarily nested tree structure with strings, integer and floating point numbers at any level. To actually interpret and describe the contents, a 3rd level of formats is needed.

As mentioned in both Simis Jinx Unicode Text File Format and Simis Jinx Binary File Format, this 3rd level of formats is identified by a letter and a number - and there are quite a lot of them. To actually define these formats in a useful way, though, we need to use another format - Backus-Naur Form (BNF). The exact format I've used is a variant of the standard Backus-Naur Form derived from the BNF files that shipped with Microsoft Train Simulator itself (in the UTILS\FFEDIT directory).

Train Simulator Backus-Naur Form

The BNF files are text; new lines have no significance; any of ASCII, UTF-8 and UTF-16 character encodings can be used, provided a byte order mark is included to identify UTF-8 and UTF-16. The files are made up of a number of definitions and productions - in any order - and a special termination marker.

Definitions specify a shared or standalone expression. Any other expression can reference it and has their reference expanded to the expression on the right-hand side of the equals ("=").

Productions specify, through the expression on the right of the arrow ("==>"), what is allowed/expected inside the block identified by the name on the left.

The expressions in both definitions and productions contain a space-separated list of items, each of which can be:

  • A string literal, e.g. "Activity".
  • A pre-defined data type, e.g. :sint. Available data types:
    • uint
    • sint
    • dword
    • string

    Data types can additionally be named, by including a comma and identifier after the type, e.g. :sint,TileX.

  • Another production or definition, e.g. :Tr_Activity.

There are three operators allowed within expressions:

  • Square brackets, denoting an optional section, e.g. [:Description].
  • Curly brackets, denoting a repeatable section (1 or more times), e.g. {:UiD :SidingItem}.
  • Pipe symbol, denoting a choice between sections, e.g. :Engine|:Wagon.

The choice operator (pipe) binds tighter than whitespace. Therefore, the expression :foo :bar|:baz means "foo followed by either bar or baz".

The end of an expression is denoted by a period (".").

Comments can be placed anywhere whitespace is allowed and use the common multi-line comment syntax of "/*" to start and "*/" to finish.

Termination of the BNF is indicated by the identifier "EOF". Everything after this is completely ignored.

3rd-level Format BNFs

Here's the current route car spawn.bnf as an example:

/* File format information */
FILE                          = :uint,Count [{:CarSpawnerItem}] .
FILE_NAME                     = "Route Car Spawn" .
FILE_EXT                      = "carspawn.dat" .
FILE_TYPE                     = "v" .
FILE_TYPE_VER                 = "1" .

/* Base types */
CarSpawnerItem                ==> :string :uint .

/* Format types */

EOF                           /* End of file */

All BNFs for the tools are required to have the five definitions shown above, so that the various programs can use them. FILE_TYPE and FILE_TYPE_VER are the letter and number (both as strings) used in all Simis Jinx files. FILE_EXT is either a file extension (e.g. "act") or a filename (e.g. "carspawn.dat") which selects which files can contain this format. FILE_NAME is a name suitable for displaying to the user. FILE is an expression representing the root of the file - the base of all parsing.

Binary Block Type Names

While the BNFs define what is allowed where, there is still one remaining problem for the Simis Jinx Binary format - each block type is identified by a number, not a string. For this, we can turn to some other files included with the original Train Simulator - the files in UTILS\FFEDIT.

  • sidn.txt defines a few base IDs, including "core" and "train" (0 and 4 respectively).
  • coreids.tok contains a list of all core "tokens" - i.e. block type names - in order of the numerical value.
  • appids.tok is a C header which includes forms.hdr and loadstr.hdr with a token defined before and after each inclusion.
  • forms.hdr and loadstr.hdr contain lists of all MSTS tokens in numerical order.

To construct the 32bit unsigned number used in the Simis Jinx Binary file format, the base ID and the token ID (from its position) are combined with the base forming the most significant 16bits and the token the least significant 16bits. E.g. the 7th "train" token would be 0x00040007.

Conclusion

Together with the BNFs, the number-block type name mapping completes the picture for loading and saving Simis Jinx files. However, as the BNFs are of my own construction, they are necessarily incomplete and possibly still inaccurate in some areas. This has improved a lot over the past few months, and will continue to do so, providing a good, solid and generic reading and writing capability for most Simis Jinx files.

Permalink | Author: | Tags: Format, Games, Microsoft, Simis, Train Simulator | Posted: 11:55PM on Sunday, 23 May, 2010 | Modified: 12:02AM on Monday, 24 May, 2010 | Comments: 0

Powered by the Content Parser System, copyright 2002 - 2024 James G. Ross.