Author Topic: Extensible file format spec ...  (Read 2887 times)

Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16879
  • Country: gb
  • Thanked: 5 times
Re: Extensible file format spec ...
« on: September 05, 2006, 10:37:33 AM »
@Jose

I have something you could use. Some time ago I wrote a machine-independent, extensible binary storage system that I called 'xsf', for extensible storage format. However, it is C++ based.

At the very root are endian-aware XSFStreamIn and StreamOut classes that provide protected methods for reading and writing blocks of variable-length data items in an endian-safe way. A file header consisting only of byte values holds some basic information: the endianness of the file (by default that of the machine creating the file, but you can override this), the expected data alignment and some other properties.
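
To make that concrete, here is a minimal sketch of the idea rather than the actual xsf code. The FileHeader layout, its field names and the writeU32/readU32 helpers are all invented for illustration, and it works on an in-memory byte buffer instead of a real stream.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical header made only of single bytes, so it reads the same on any machine.
struct FileHeader {
    std::uint8_t bigEndian;  // 1 = multi-byte items in the file are big endian, 0 = little endian
    std::uint8_t alignment;  // expected data alignment in bytes (unused in this sketch)
};

// Append a 32-bit value to the buffer in the byte order the header asks for.
inline void writeU32(std::vector<std::uint8_t>& out, const FileHeader& h, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        int shift = h.bigEndian ? (24 - 8 * i) : (8 * i);
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
    }
}

// Read a 32-bit value back. Because it assembles the value from individual
// bytes, the same code works on big and little endian hosts.
inline std::uint32_t readU32(const std::vector<std::uint8_t>& in, std::size_t pos,
                             const FileHeader& h) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        int shift = h.bigEndian ? (24 - 8 * i) : (8 * i);
        v |= static_cast<std::uint32_t>(in[pos + i]) << shift;
    }
    return v;
}
```

The point is only that the reader consults the header, never the host CPU, to decide how bytes are ordered.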

On top of this there is an XSFStorable class, which has access to the IO methods of the above streams. It defines a basic wrapper for an object you want to be able to store; your target class gets these properties by inheriting from XSFStorable. What that class is, is entirely up to you; it can hold any kind of data. You just use the XSFStorable methods to define how it is serialized and unserialized, and those in turn use the block IO routines. The serialized XSFStorable object becomes a well-defined chunk within the file, complete with the XSFStorable header information that the system uses when parsing files.
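
Roughly, the inheritance pattern looks like the sketch below. This is not the real XSFStorable interface: the Storable base class, its method names, the putU16/getU16 helpers and the Point example are stand-ins I made up, and it uses standard iostreams with a fixed big endian order to stay short.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for XSFStorable; the real class and its methods differ.
class Storable {
public:
    virtual ~Storable() {}
    virtual void serialize(std::ostream& out) const = 0;
    virtual void unserialize(std::istream& in) = 0;

protected:
    // Block IO helpers only subclasses can reach (big endian here for brevity).
    static void putU16(std::ostream& out, std::uint16_t v) {
        out.put(static_cast<char>(v >> 8));
        out.put(static_cast<char>(v & 0xFF));
    }
    static std::uint16_t getU16(std::istream& in) {
        unsigned hi = static_cast<unsigned char>(in.get());
        unsigned lo = static_cast<unsigned char>(in.get());
        return static_cast<std::uint16_t>((hi << 8) | lo);
    }
};

// A target class becomes storable by inheriting and describing its own fields.
class Point : public Storable {
public:
    std::uint16_t x = 0, y = 0;
    void serialize(std::ostream& out) const override { putU16(out, x); putU16(out, y); }
    void unserialize(std::istream& in) override { x = getU16(in); y = getU16(in); }
};
```

Serializing a Point into a std::stringstream and unserializing it into a second Point should round-trip both fields. The chunk header the real system wraps around each object is left out here.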

The basic file structure and XSFStorable itself provide some basic type information that allows the system to parse serialized XSFStorable chunks it doesn't recognise (at the very least, it can skip past them).
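
Skipping unknown chunks could work along these lines. The header layout here (a 4-byte type id followed by a 4-byte payload size, both big endian) is an assumption for the sake of the example, not the real xsf chunk header, and ChunkHeader, readBE32 and findChunk are made-up names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical chunk header: type id, then payload size in bytes.
struct ChunkHeader {
    std::uint32_t typeId;
    std::uint32_t rawSize;
};

inline std::uint32_t readBE32(const std::vector<std::uint8_t>& f, std::size_t pos) {
    return (std::uint32_t(f[pos]) << 24) | (std::uint32_t(f[pos + 1]) << 16) |
           (std::uint32_t(f[pos + 2]) << 8) | std::uint32_t(f[pos + 3]);
}

// Return the payload offset of the first chunk of the wanted type, stepping
// over every other chunk using only its size field. Returns f.size() if no
// such chunk is found or a size field runs past the end of the data.
inline std::size_t findChunk(const std::vector<std::uint8_t>& f, std::uint32_t wanted) {
    std::size_t pos = 0;
    while (pos + 8 <= f.size()) {
        ChunkHeader h{readBE32(f, pos), readBE32(f, pos + 4)};
        std::size_t payload = pos + 8;
        if (h.rawSize > f.size() - payload) break;  // invalid size, give up
        if (h.typeId == wanted) return payload;
        pos = payload + h.rawSize;                  // skip a chunk we don't care about
    }
    return f.size();
}
```

The reader never has to understand a chunk's payload to get past it; the size in the header is enough.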

Lastly, the AmigaOS version's IO routines are implemented as asynchronous, double-buffered routines similar to asyncio.library (in fact based on some old example code that I think became the basis of said library too), but they also provide on-the-fly endian conversion for block reads and writes (a byteswapping copy, if you prefer) where needed. Naturally those bits use asm ;-)
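
For anyone unsure what a byteswapping copy is, here is a plain, portable C++ sketch of one for 32-bit items. It is synchronous and has none of the asm or double buffering of the real routines; swapCopy32 is a name made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Copy 'count' 32-bit words from src to dst, reversing the bytes of each word.
// It works on raw bytes, so it does not depend on host endianness or alignment.
// src and dst must not overlap.
inline void swapCopy32(std::uint8_t* dst, const std::uint8_t* src, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[0] = src[3];
        dst[1] = src[2];
        dst[2] = src[1];
        dst[3] = src[0];
        dst += 4;
        src += 4;
    }
}
```

Doing the swap as part of the copy out of the IO buffer means the data is only touched once, which is presumably why it pays to fold the two together.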

If you are interested I can send you the code, but as it is part of a larger system, you'll need to pull out the bits you need.
int p; // A
 

Offline Karlos

Re: Extensible file format spec ...
« Reply #1 on: September 06, 2006, 08:11:25 PM »
One of the standard XSFStorable classes was a catalogue class. If a file contained multiple serialised objects, the catalogue tracked them. The catalogue chunk was always the very last chunk in any file that had one. If you deleted a chunk, its space was up for recycling. If a new chunk was too big to fit in any gap, it went to the end of the file (that is, the catalogue got overwritten and the updated one was then added at the new end of file).
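
The recycling rule can be sketched like this. It assumes a simple first-fit search over the gaps, which may not be what the real catalogue did, and the Gap and Catalogue types here are invented for the example (the real catalogue was itself an XSFStorable chunk, which this ignores).

```cpp
#include <cstddef>
#include <vector>

// A free region left behind by a deleted chunk.
struct Gap {
    std::size_t offset;
    std::size_t size;
};

// Hypothetical in-memory catalogue state: the known gaps, plus the offset
// where chunk data ends (which is where the catalogue chunk itself would sit).
struct Catalogue {
    std::vector<Gap> gaps;
    std::size_t endOfData = 0;

    // Deleting a chunk just marks its space as reusable.
    void release(std::size_t offset, std::size_t size) { gaps.push_back({offset, size}); }

    // First fit: reuse the first gap that is big enough, otherwise append at
    // the end of the data (overwriting the old catalogue position).
    std::size_t place(std::size_t size) {
        for (Gap& g : gaps) {
            if (g.size >= size) {
                std::size_t at = g.offset;
                g.offset += size;  // shrink the gap from the front
                g.size -= size;
                return at;
            }
        }
        std::size_t at = endOfData;
        endOfData += size;
        return at;
    }
};
```

After place() returns an end-of-data offset, the updated catalogue would be rewritten at the new endOfData.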

I wrote a simple CLI tool that would optimise any XSF file by collapsing out the free space and updating the catalogue.

I extended it with an experimental free-space-tracking catalogue to make a more effective system, but I never intended to make some sort of replacement 'in file' filesystem. 99% of the time a file only ever had one serialized object in it anyway :-)

-edit-

It's also noteworthy that the catalogue was totally optional. Nothing prevented you from having many chunks in a file. If you attempted to read an object, the system would skip to the next chunk from wherever it was. Because you never read directly, only unserialize, you can only ever be 'on' a chunk. Of course it would scan if that wasn't the case, for example if a serialized chunk had an invalid rawSize.

I'll see if I can find my design docs.