Silver's Simple Site - Weblog - 2010 - May


Simis Jinx Binary File Format

This is the second of the 2nd-level (inner) formats for the Simis file format used by Microsoft Train Simulator.

Unlike the text format, the binary format has a simple binary header - but it should look rather familiar. The binary header has the same 16 characters as the text header, but this time they're encoded as single bytes and one character is notably different: the 8th character is "b" for binary, rather than "t" for text (if a single-byte text format existed, this would be the only difference between the two headers).

 00000010   4A 49 4E 58  30 .. .. 62  5F 5F 5F 5F  5F 5F 0D 0A   JINX0..b______..

Just like in the text format, the 2 missing bytes in the middle are the two characters (a letter then a number) that identify the 3rd-level format used. We will get to those next time.
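
As a rough illustration of the layout described above, the 16-byte header can be checked with a small Python sketch. The function name and the regular expression are my own assumptions for illustration, not code from any Simis tool; the sample bytes are the header from the capview.iom example later in this post.

```python
import re

# Layout as described above: "JINX0", a letter and a number identifying the
# 3rd-level format, "b" for binary, six underscores, then CR LF (16 bytes).
HEADER_RE = re.compile(rb"JINX0([A-Za-z])([0-9])b______\r\n")


def parse_binary_jinx_header(header: bytes) -> str:
    """Return the 3rd-level format identifier (e.g. "i0") from the header."""
    match = HEADER_RE.fullmatch(header)
    if match is None:
        raise ValueError("not a Simis Jinx binary header")
    letter, number = match.groups()
    return (letter + number).decode("ascii")


example = bytes.fromhex("4A494E58306930625F5F5F5F5F5F0D0A")
print(parse_binary_jinx_header(example))  # i0
```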

Note: If the 1st-level format is using compression, this header is the first item within the compressed stream; for simplicity, all my examples will be for uncompressed files.

Now follows the actual data...

As I started talking about last time, Simis Jinx files are basic trees; the binary format is nothing more than an alternative representation of the same data. It is, however, more of a challenge to read and write correctly - something the 3rd-level formats will help deal with.

Each node in the tree has an 8 byte header, consisting of a 4 byte unsigned integer identifying the node's type and a 4 byte unsigned integer specifying the length of the contents. The contents consist of an optional name for the enclosing node (1 byte for length plus UTF16-LE characters), followed by the child values and nodes.

There are a number of common types of value included:

  • Unsigned integer (4 bytes).
  • Signed integer (4 bytes).
  • Floating-point number (4 bytes).
  • String (2 bytes for length plus UTF16-LE characters).

Let's have a look at an example, GLOBAL\capview.iom, but remember that to correctly parse this I am using the 3rd-level format:

 00000000   53 49 4D 49  53 41 40 40  40 40 40 40  40 40 40 40   SIMISA@@@@@@@@@@

The standard 1st-level header...

 00000010   4A 49 4E 58  30 69 30 62  5F 5F 5F 5F  5F 5F 0D 0A   JINX0i0b______..

2nd-level header indicating a binary version of 3rd-level format 'i0'.

 00000020   63 00 00 00  64 02 00 00                             c...d...

Node type is 99 (0x63), contents size is 612 bytes (0x264).

 00000020                             00                                 .

Node has no name.

 00000020                                01 00 00  00 00 00 00            .......
 00000030   00                                                   .

Two unsigned integer values: 1 and 0.

 00000030      64 00 00  00 35 00 00  00 00                       d...5....

Node type is 100 (0x64), contents size is 53 bytes (0x35) and there's no name.

 00000030                                   CB 00  01 00                   ....

Unsigned integer value: 65,739 (0x100CB).

 00000030                                                15 00                 ..
 00000040   6E 00 75 00  64 00 67 00  65 00 5F 00  63 00 61 00   n.u.d.g.e._.c.a.
 00000050   62 00 63 00  6F 00 6E 00  74 00 72 00  6F 00 6C 00   b.c.o.n.t.r.o.l.
 00000060   5F 00 6C 00  65 00 66 00  74 00                      _.l.e.f.t.

String value: length of 21 (0x15) plus 21 UTF16-LE characters "nudge_cabcontrol_left".

 00000060                                   00 00  00 00                   ....

Unsigned integer value: 0.

The 1 byte for no name, 8 bytes for two unsigned integer values plus 44 bytes for the string mean the total contents are up to 53 bytes - that means it is the end of this node type 100.
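
The byte counting above can be double-checked with a short Python sketch. It rebuilds bytes 0x31 to 0x6D from the hex dumps and decodes them with the layout hard-coded (unsigned integer, string, unsigned integer), since picking that layout automatically is exactly what the 3rd-level format is needed for. The variable names are mine.

```python
import struct

# Bytes 0x31-0x6D of the capview.iom example, rebuilt from the dumps above.
node = bytes.fromhex(
    "64000000" "35000000"  # node type 100, contents length 53
    "00"                   # name length 0 (no name)
    "CB000100"             # unsigned integer 65739
    "1500"                 # string length 21
) + "nudge_cabcontrol_left".encode("utf-16-le") + bytes(4)  # then uint 0

node_type, length = struct.unpack_from("<II", node, 0)
pos = 8                                     # contents start after the header
name_length = node[pos]
pos += 1
(first,) = struct.unpack_from("<I", node, pos)
pos += 4
(chars,) = struct.unpack_from("<H", node, pos)
pos += 2
text = node[pos:pos + 2 * chars].decode("utf-16-le")
pos += 2 * chars
(last,) = struct.unpack_from("<I", node, pos)
pos += 4

consumed = pos - 8                          # bytes of contents read so far
print(node_type, length, consumed, first, text, last)
# 100 53 53 65739 nudge_cabcontrol_left 0
```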

 00000060                                                64 00                 d.
 00000070   00 00 37 00  00 00 00                                ..7....

Node type is 100 (0x64), contents size is 55 bytes (0x37) and there's no name.

Writing the above in text format would give:

 <node type 99> (
    1
    0
    <node type 100> (
        65739
        nudge_cabcontrol_left
        0
    )
    <node type 100> (
...

It's clear that we're missing the node type names found in the text files, and there's no indication whether some 4 bytes are a new node, an integer or float - some heuristics can work for this some of the time, I found, but in the end this is what the 3rd-level format is for.

Other 2nd-level formats

I've covered the two main Simis 2nd-level formats, but there are others; most notably, the texture files (.ace) are wrapped in a 1st-level Simis header with their own format inside. I won't be covering these other formats soon, as there are already tools that can handle these files sufficiently for Microsoft Train Simulator's needs, and the 3rd-level Simis formats are more interesting anyway.

Tags: Format, Games, Microsoft, Simis, Train Simulator | Posted: 11:07PM on Sunday, 09 May, 2010


Simis Jinx 3rd Level File Formats

The Simis file format and its 2nd-level Unicode text and binary Jinx formats make up a pretty generic set of formats; they contain an arbitrarily nested tree structure with strings, integers and floating-point numbers at any level. To actually interpret and describe the contents, a 3rd level of formats is needed.

As mentioned in both Simis Jinx Unicode Text File Format and Simis Jinx Binary File Format, this 3rd level of formats is identified by a letter and a number - and there are quite a lot of them. To actually define these formats in a useful way, though, we need to use another format - Backus-Naur Form (BNF). The exact format I've used is a variant of the standard Backus-Naur Form derived from the BNF files that shipped with Microsoft Train Simulator itself (in the UTILS\FFEDIT directory).

Train Simulator Backus-Naur Form

The BNF files are text; new lines have no significance; any of ASCII, UTF-8 and UTF-16 character encodings can be used, provided a byte order mark is included to identify UTF-8 and UTF-16. The files are made up of a number of definitions and productions - in any order - and a special termination marker.

Definitions specify a shared or standalone expression. Any other expression can reference it and has its reference expanded to the expression on the right-hand side of the equals ("=").

Productions specify, through the expression on the right of the arrow ("==>"), what is allowed/expected inside the block identified by the name on the left.

The expressions in both definitions and productions contain a space-separated list of items, each of which can be:

  • A string literal, e.g. "Activity".
  • A pre-defined data type, e.g. :sint. Available data types:
    • uint
    • sint
    • dword
    • string

    Data types can additionally be named, by including a comma and identifier after the type, e.g. :sint,TileX.

  • Another production or definition, e.g. :Tr_Activity.

There are three operators allowed within expressions:

  • Square brackets, denoting an optional section, e.g. [:Description].
  • Curly brackets, denoting a repeatable section (1 or more times), e.g. {:UiD :SidingItem}.
  • Pipe symbol, denoting a choice between sections, e.g. :Engine|:Wagon.

The choice operator (pipe) binds tighter than whitespace. Therefore, the expression :foo :bar|:baz means "foo followed by either bar or baz".
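
To make the binding rule concrete, here is a small recursive-descent sketch in Python that parses a single expression (without its terminating period) into nested lists and tuples. It is only an illustration of the precedence described above; the token pattern, the function names and the output shape are all assumptions of mine, not how the Train Simulator tools or Simis Editor work.

```python
import re

# String literals, ":type" or ":type,Name" references, and the operators.
TOKEN_RE = re.compile(r'"[^"]*"|:\w+(?:,\w+)?|[\[\]{}|]')


def parse_expression(text: str) -> list:
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def parse_item():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token in ("]", "}", "|"):
            raise ValueError(f"unexpected {token!r}")
        if token in ("[", "{"):
            closer = "]" if token == "[" else "}"
            body = parse_sequence(closer)
            if pos >= len(tokens):
                raise ValueError(f"missing {closer!r}")
            pos += 1  # skip the closing bracket
            return ("optional" if token == "[" else "repeat", body)
        return token

    def parse_choice():
        nonlocal pos
        options = [parse_item()]
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1  # skip the pipe and read the next alternative
            options.append(parse_item())
        return options[0] if len(options) == 1 else ("choice", options)

    def parse_sequence(closer=None):
        # A sequence is a run of choices, so "|" binds tighter than whitespace.
        items = []
        while pos < len(tokens) and tokens[pos] != closer:
            items.append(parse_choice())
        return items

    return parse_sequence()


print(parse_expression(":foo :bar|:baz"))
# [':foo', ('choice', [':bar', ':baz'])]
```

Because a sequence is built from choices rather than the other way round, :foo stays outside the choice, matching "foo followed by either bar or baz".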

The end of an expression is denoted by a period (".").

Comments can be placed anywhere whitespace is allowed and use the common multi-line comment syntax of "/*" to start and "*/" to finish.

Termination of the BNF is indicated by the identifier "EOF". Everything after this is completely ignored.
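
Putting the file-level rules together (comments, definitions, productions, the period terminator and the EOF marker), a BNF file could be split up roughly as follows. This Python sketch is an assumption-laden illustration rather than the actual tool behaviour: it strips comments with a simple pattern (so a "/*" inside a string literal would confuse it), ignores character encodings and byte order marks, and leaves each expression as a flat token list.

```python
import re

# String literals first, so a period inside "carspawn.dat" is not a terminator.
TOKEN_RE = re.compile(r'"[^"]*"|==>|[=.\[\]{}|]|:?\w+(?:,\w+)?')


def parse_bnf(text: str) -> tuple[dict, dict]:
    """Return (definitions, productions), each mapping a name to its tokens."""
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)  # drop comments
    tokens = TOKEN_RE.findall(text)
    definitions, productions = {}, {}
    pos = 0
    # Stop at the "EOF" identifier; everything after it is ignored.
    while pos < len(tokens) and tokens[pos] != "EOF":
        name, operator = tokens[pos], tokens[pos + 1]
        if operator not in ("=", "==>"):
            raise ValueError(f"expected '=' or '==>' after {name!r}")
        end = tokens.index(".", pos + 2)  # the period ends the expression
        target = definitions if operator == "=" else productions
        target[name] = tokens[pos + 2:end]
        pos = end + 1
    return definitions, productions


sample = """
/* Base types */
Item ==> :string :uint .
FILE = :uint,Count [{:Item}] .
EOF  /* End of file */
Ignored = "never read" .
"""
definitions, productions = parse_bnf(sample)
print(sorted(definitions), sorted(productions))  # ['FILE'] ['Item']
```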

3rd-level Format BNFs

Here's the current route car spawn.bnf as an example:

/* File format information */
FILE                          = :uint,Count [{:CarSpawnerItem}] .
FILE_NAME                     = "Route Car Spawn" .
FILE_EXT                      = "carspawn.dat" .
FILE_TYPE                     = "v" .
FILE_TYPE_VER                 = "1" .

/* Base types */
CarSpawnerItem                ==> :string :uint .

/* Format types */

EOF                           /* End of file */

All BNFs for the tools are required to have the five definitions shown above, so that the various programs can use them. FILE_TYPE and FILE_TYPE_VER are the letter and number (both as strings) used in all Simis Jinx files. FILE_EXT is either a file extension (e.g. "act") or a filename (e.g. "carspawn.dat") which selects which files can contain this format. FILE_NAME is a name suitable for displaying to the user. FILE is an expression representing the root of the file - the base of all parsing.

Binary Block Type Names

While the BNFs define what is allowed where, there is still one remaining problem for the Simis Jinx Binary format - each block type is identified by a number, not a string. For this, we can turn to some other files included with the original Train Simulator - the files in UTILS\FFEDIT.

  • sidn.txt defines a few base IDs, including "core" and "train" (0 and 4 respectively).
  • coreids.tok contains a list of all core "tokens" - i.e. block type names - in order of the numerical value.
  • appids.tok is a C header which includes forms.hdr and loadstr.hdr with a token defined before and after each inclusion.
  • forms.hdr and loadstr.hdr contain lists of all MSTS tokens in numerical order.

To construct the 32-bit unsigned number used in the Simis Jinx Binary file format, the base ID and the token ID (from its position) are combined with the base forming the most significant 16 bits and the token the least significant 16 bits. E.g. the 7th "train" token would be 0x00040007.
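
The combination described above is a shift and a bitwise OR. A minimal Python sketch, with hypothetical function names, taking the token ID as already worked out from its position in the list:

```python
def make_block_id(base_id: int, token_id: int) -> int:
    """Combine a base ID (high 16 bits) and a token ID (low 16 bits)."""
    if not (0 <= base_id <= 0xFFFF and 0 <= token_id <= 0xFFFF):
        raise ValueError("base and token IDs must each fit in 16 bits")
    return (base_id << 16) | token_id


def split_block_id(block_id: int) -> tuple[int, int]:
    """Split a 32-bit block type number back into (base_id, token_id)."""
    return block_id >> 16, block_id & 0xFFFF


print(hex(make_block_id(4, 7)))  # 0x40007
print(split_block_id(0x63))      # (0, 99)
```

The second call uses node type 99 (0x63) from the capview.iom example in the binary format post: base 0 is "core", so that node's name would come from coreids.tok.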

Conclusion

Together with the BNFs, the number-block type name mapping completes the picture for loading and saving Simis Jinx files. However, as the BNFs are of my own construction, they are necessarily incomplete and possibly still inaccurate in some areas. This has improved a lot over the past few months, and will continue to do so, providing a good, solid and generic reading and writing capability for most Simis Jinx files.

Tags: Format, Games, Microsoft, Simis, Train Simulator | Posted: 11:55PM on Sunday, 23 May, 2010 | Modified: 12:02AM on Monday, 24 May, 2010


Simis Editor - Feedback class

Feedback makes everything better, eventually. Getting or sending feedback is, however, not always simple or usable; users need to be able to bang out simple comments easily, with no forms to fill in, whilst still providing proper context and technical information if the feedback is the result of the application malfunctioning. Feedback should also be anonymous if the user wishes. The Feedback class in the next release of Simis Editor is attempting to do this; here I'm going to outline its user-facing functionality and the back-end implementation.

Entry Points

There are two different ways the feedback process can be started:

  • From the user: a "Send Feedback..." menu item under "Help".
  • From the application: anywhere in the application that catches exceptions.

While both routes show the same dialog, the latter case collects a load more contextual information to go with the report - most obviously, the exception, but it can also take anything the catch code wants to include.

Instantiation Code

The Feedback class is really simple to use, for both cases:

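    try {
        // User feedback: no arguments.
        new Feedback().PromptAndSend(ownerForm);
    } catch (SomeException e) {
        // Application failure: pass the exception and the operation that failed.
        new Feedback(e, "sending feedback").PromptAndSend(ownerForm);
    }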

The ownerForm is used for showing the dialog modally. The class switches mode based on the arguments: none means "user feedback", Exception (exception) and String (operation) mean "application failure"; there is also a third mode where the caller provides the feedback type, operation and an IDictionary<string, string> of details.

User Dialog

The dialog is mostly the same for the two cases; the biggest change is the "faces" and introductory text. For user feedback, the introduction just explains when to include your e-mail address, as it is entirely optional.

In the application failure case, this dialog is the first thing the user sees when an operation fails, so it must explain that something's gone wrong and then why you should send the feedback at all.

As the purpose of the feedback dialog is to collect as many reports as possible, it tries to make as many users as possible comfortable with sending them by letting the user view all the data collected for sending. As shown below, this includes the full exception details (obviously) as well as some general system information. It also includes a user ID, which is randomly generated the first time the application intends to send feedback and which is not shared between applications (i.e. two applications that a user has installed that use this feedback system will each send a different user ID).

If the user is happy to send the report and clicks the button, an XML document is constructed, serialised and POSTed to the feedback server. The user is then given a message showing the success or failure of the feedback as a clear completion of the process.

Feedback Format

The feedback is sent as XML to make handling the data as easy as possible. This is an example of an application failure report, but user feedback reports are basically the same - just without the <details>.

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<report version="1.0" uid="ipejGfrUIt5gAZ3Y" time="2010-05-31T22:13:56.4276545+01:00" type="ApplicationFailure" email="">
 <environment>
   <os version="6.1.7600.0">Microsoft Windows NT 6.1.7600.0</os>
   <processor cores="4" />
   <clr bits="64" version="2.0.50727.4927" />
 </environment>
 <application version="0.3.0.0">Simis Editor</application>
 <source file="C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\Simis Editor\Editor.cs" line="185" column="5">SimisEditor.Editor.OpenFile</source>
 <details>C:\Program Files (x86)\Microsoft Games\Train Simulator\ROUTES\JAPAN2\carspawn.dat

> From 0x00000122 - data preceding failure:
>   wnerItem( "Jp1van.s" 6 )
>   CarSpawnerItem( "Jp1van2.s" 6 )
>   )
>  
>  
>
> From 0x000001A2 - data following failure:
>  
>
> > BNF has completed.
> >
> > Available states: .
> > Current rule: <none>.
> > Current state:
> >
> >    at Jgr.Grammar.BnfState.LeaveBlock() in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\JGR.Grammar\BNF.cs:line 175
> >    at Jgr.IO.Parser.SimisReader.ReadToken() in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\JGR.IO.Parser\SimisReader.cs:line 181
>
>    at Jgr.IO.Parser.SimisReader.ReadToken() in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\JGR.IO.Parser\SimisReader.cs:line 196
>    at Jgr.IO.Parser.SimisFile.ReadStream(Stream stream, SimisFormat& simisFormat, SimisStreamFormat& streamFormat, Boolean& streamCompressed, SimisTreeNode& tree) in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\JGR.IO.Parser\SimisFile.cs:line 74
>    at Jgr.IO.Parser.SimisFile..ctor(String fileName, SimisProvider simisProvider) in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\JGR.IO.Parser\SimisFile.cs:line 32

  at Jgr.IO.Parser.SimisFile..ctor(String fileName, SimisProvider simisProvider) in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\JGR.IO.Parser\SimisFile.cs:line 37
  at Jgr.IO.Parser.MutableSimisFile.Read() in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\JGR.IO.Parser\MutableSimisFile.cs:line 28
  at SimisEditor.Editor.OpenFile(String filename) in C:\Users\James\Documents\Visual Studio 2008\Projects\JGR MSTS Editor\Simis Editor\Editor.cs:line 185</details>
 <comments></comments>
</report>

One thing which this does not show is "attachments" - where the code calling the Feedback class specifies arbitrary extra data to include; these are sent as additional details but each with a name: <details name="extra stuff">...</details>.
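
To show how the pieces of the report fit together, here is a rough Python sketch that builds a document of the same shape. Simis Editor itself is written in C#, so this is not its code: the function names, the 16-character alphanumeric user ID (guessed from the example uid above) and the omission of the <environment> and <source> elements are all simplifications of mine.

```python
import secrets
import string
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def new_user_id(length: int = 16) -> str:
    """Generate a random user ID; the real generation scheme is not known."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def build_report(uid, report_type, application, app_version,
                 details="", attachments=None, email="", comments=""):
    """Build a <report> document as text (environment and source omitted)."""
    report = ET.Element("report", {
        "version": "1.0",
        "uid": uid,
        "time": datetime.now(timezone.utc).isoformat(),
        "type": report_type,
        "email": email,
    })
    ET.SubElement(report, "application", version=app_version).text = application
    if details:  # user feedback reports carry no unnamed <details>
        ET.SubElement(report, "details").text = details
    for name, value in (attachments or {}).items():
        # Attachments are extra <details> elements, each with a name.
        ET.SubElement(report, "details", name=name).text = value
    ET.SubElement(report, "comments").text = comments
    return ET.tostring(report, encoding="unicode")


print(build_report(new_user_id(), "ApplicationFailure", "Simis Editor",
                   "0.3.0.0", details="SimisFormat& simisFormat",
                   attachments={"extra stuff": "..."}))
```

ElementTree escapes characters such as "&" and "<" in the details text when serialising, which matters for stack traces like the one above.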

Tags: Editor, Feedback, Simis, XML | Posted: 10:42PM on Monday, 31 May, 2010

Powered by the Content Parser System, copyright 2002 - 2024 James G. Ross.