Amiga.org

Amiga computer related discussion => Amiga Software Issues and Discussion => Topic started by: orange on December 17, 2017, 05:36:04 PM

Title: MFM decode
Post by: orange on December 17, 2017, 05:36:04 PM: does 'rawread' dump MFM encoded data? if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)
Title: Re: MFM decode
Post by: guest11527 on December 17, 2017, 05:56:15 PM: Quote from: orange;834180
does 'rawread' dump MFM encoded data?
Yes.
Quote from: orange;834180
if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)
That, of course, depends on the format. MFM is a 1:2 encoding: every data bit is encoded as two bits. In particular, the filler bit between two data bits is 1 if and only if both data bits are zero; otherwise, the filler bit is 0. How the data is laid out, how the data bits are spread, and what the track and sector headers look like is entirely a decision of the format.

The format of the trackdisk.device is described in the RKRM Hardware; you will find all the information there. In this format, the 512 bytes of payload data are separated into two 256-byte groups (odd and even bits), which is a rather untypical layout, but it allows fast decoding with the blitter. Typically, filler bits are interleaved with the subsequent data bits.
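
As a rough illustration of the filler-bit rule, here is a minimal C sketch of the typical interleaved layout (filler bit first, then the data bit). The function name mfm_encode_byte and its prev_bit parameter are made up for this example; they are not part of any Amiga API.

```c
#include <stdint.h>

/* Encode one data byte as 16 MFM bits, most significant bit first.
 * Each data bit is preceded by its filler (clock) bit.
 * prev_bit is the last data bit that came before this byte.
 * Hypothetical helper, for illustration only. */
uint16_t mfm_encode_byte(uint8_t data, int prev_bit)
{
    uint16_t out = 0;
    int i;

    for (i = 7; i >= 0; i--) {
        int bit = (data >> i) & 1;
        /* The filler bit is 1 only between two 0 data bits. */
        int clock = (prev_bit == 0 && bit == 0) ? 1 : 0;

        out = (uint16_t)((out << 2) | (clock << 1) | bit);
        prev_bit = bit;
    }
    return out;
}
```

With prev_bit = 0, the byte 0x00 comes out as 0xAAAA and 0xA1 as 0x44A9. The 0x4489 sync word is the same pattern with one filler bit left out, which is why it can never appear in regular data.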
Title: Re: MFM decode
Post by: orange on December 17, 2017, 07:00:51 PM: thanks Thomas.
Title: Re: MFM decode
Post by: olsen on December 18, 2017, 12:31:07 PM: Quote from: orange;834180
does 'rawread' dump MFM encoded data? if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)

Documented MFM decoding program code is pretty scarce (I've seen my share, and I still can't believe that the people who wrote it trusted their own code!). You might want to dip into TrackSalve (http://aminet.net/disk/misc/TrackSalve.lha), which is somewhat old, though. "TrackSalve" is a patch for trackdisk.device by Dirk Reisig, last updated in 1990, fixing Kickstart 1.x-specific bugs which by that time had already been fixed for Kickstart 2.0. "TrackSalve" covers just about everything. Bonus: it uses the blitter for encoding/decoding.
Title: Re: MFM decode
Post by: olsen on December 18, 2017, 01:43:23 PM: Quote from: Thomas Richter;834181
Yes.

That, of course, depends on the format. MFM is a 1:2 encoding: every data bit is encoded as two bits. In particular, the filler bit between two data bits is 1 if and only if both data bits are zero; otherwise, the filler bit is 0. How the data is laid out, how the data bits are spread, and what the track and sector headers look like is entirely a decision of the format.

The format of the trackdisk.device is described in the RKRM Hardware; you will find all the information there. In this format, the 512 bytes of payload data are separated into two 256-byte groups (odd and even bits), which is a rather untypical layout, but it allows fast decoding with the blitter. Typically, filler bits are interleaved with the subsequent data bits.

I just double-checked: the disk format documentation ended up in "Appendix C" of the 3rd edition "Devices" ROM Kernel Reference Manual. It seems that it was originally part of the 1st edition "Libraries & Devices" ROM Kernel Reference manual in "Appendix L", but you can't find that version online.

Anyway, here's what I found, from way back (1985):

Code:
COMMODORE-AMIGA DISK FORMAT

The following are details about how the bits on the Commodore-Amiga
disk are actually written.

Gross Data Organization:

    3 1/2 inch disk
    double-sided
    80 cylinders/160 tracks

Per-track Organization:

    Nulls written as a gap, then 11 sectors of data.
    No gaps written between sectors.

Per-sector Organization:

    All data is MFM encoded. This is the pre-encoded contents
    of each sector:

        two bytes of 00 data    (MFM = AAAA each)
        two bytes of A1*        ( "standard sync byte" -- MFM
                                  encoded A1 without a clock pulse )
                                (MFM = 4489 each)

        one byte of format-byte (Amiga 1.0 format = FF)
        one byte of track number
        one byte of sector number
        one byte of sectors until end of write (NOTE 1)
            [above 4 bytes treated as one longword
             for purposes of MFM encoding]

        16 bytes of OS recovery info (NOTE 2)
            [treated as a block of 16 bytes for encoding]

        four bytes of header checksum
            [treated as a longword for encoding]

        four bytes of data-area checksum
            [treated as a longword for encoding]

        512 bytes of data
            [treated as a block of 512 bytes for encoding]

NOTES:

    NOTE 1. The track number and sector number are constant for each
    particular sector. However, the sector offset byte changes each
    time we rewrite the track.

    The Amiga does a full track read starting at a random position on
    the track and going for slightly more than a full track read to
    assure that all data gets into the buffer. The data buffer is
    examined to determine where the first sector of data begins as
    compared to the start of the buffer. The track data is block moved
    to the beginning of the buffer so as to align some sector with the
    first location in the buffer.

    Because we start reading at a random spot, the read data may be
    divided into three chunks: a series of sectors, the track gap, and
    another series of sectors. The sector offset value tells the disk
    software how many more sectors remain before the gap. From this the
    software can figure out the buffer memory location of the last byte
    of legal data in the buffer. It can then search past the gap for
    the next sync byte and, having found it, can block move the rest of
    the disk data so that all 11 sectors of data are contiguous.

    Example:

        first-ever write of the track from a buffer like this:

        <GAP> |sector0|sector1|sector2|.....|sector10|

        sector offset values:

                  11      10      9     ....     1

        (if I find this one at the start of my read buffer, then I
        know there are this many more sectors with no intervening
        gaps before I hit a gap).

        sample read of this track:

        <junk>|sector9|sector10|<gap>|sector0|...|sector8|<junk>

        value of 'sectors till end of write':

                  2       1     ....     11          3

        result of track realigning:

        <GAP>|sector9|sector10|sector0|...|sector8|

        new sectors till end of write:

                 11      10       9    ...     1

        so that when the track is rewritten, the sector offsets
        are adjusted to match the way the data was written.

    NOTE 2. This is operating systems dependent data and relates to
    how AmigaDos assigns sectors to files. Reserved for future use.

GENERAL:

    When data is MFM encoded, the encoding is performed on the basis
    of a data block-size. In the sector encoding described above, there
    are bytes individually encoded; three segments of 4 bytes of data
    each, treated as longwords; one segment of 16 bytes treated as a
    block; two segments of longwords for the header and data checksums;
    and the data area of 512 bytes treated as a block.

    When the data is encoded, the odd bits are encoded first, then the
    even bits of the block. (Make a block of bytes formed from all odd
    bits of the block, encode as MFM. Make a block of bytes formed from
    all even bits of the block, encode as MFM. Even bits are shifted
    left one bit position before being encoded.)

SOURCE CODE FOR DATA ENCODE/DECODE

decodeBlock( mfmbuffer, userbuffer, numwords )
WORD *mfmbuffer;    /* the encoded data */
WORD *userbuffer;   /* where to put the decoded data */
int numwords;       /* the number of WORDS of data (not bytes) */
{
    WORD *oddptr, *evenptr, oddbits, evenbits;

    oddptr = mfmbuffer;

    /* the even region starts right after the odd one */
    evenptr = &mfmbuffer[numwords];

    while( numwords-- > 0 ) {
        /* mask off the mfm clock bits, and shift the word */
        oddbits = ((*oddptr++ << 1) & 0xAAAA);

        /* even bits are already in the right place. Just mask off clock */
        evenbits = ((*evenptr++) & 0x5555);

        /* recombine the two sections */
        *userbuffer++ = oddbits | evenbits;
    }
}

encodeBlock( mfmbuffer, userbuffer, numwords )
WORD *mfmbuffer;    /* where to put the encoded data */
WORD *userbuffer;   /* the user data, before encoding */
int numwords;       /* the number of WORDS of data (not bytes) */
{
    WORD *oddptr, *evenptr;
    WORD *ubuf;
    int i;

    oddptr = mfmbuffer;

    /* the even region starts right after the odd one */
    evenptr = &mfmbuffer[numwords];

    /* mfmencode takes one word of mfm data and correctly sets
     * the clock bits
     */

    /* encode the odd bits */
    for( ubuf = userbuffer, i = numwords; i > 0; i-- ) {
        *oddptr++ = mfmencode( (*ubuf++ >> 1) & 0x5555 );
    }

    /* encode the even bits */
    for( ubuf = userbuffer, i = numwords; i > 0; i-- ) {
        *evenptr++ = mfmencode( *ubuf++ & 0x5555 );
    }
}

Documentation on how the sector header and data area checksums are calculated remains elusive, I'm afraid.
Title: Re: MFM decode
Post by: guest11527 on December 18, 2017, 02:13:07 PM: Quote from: olsen;834189
Documentation on how the sector header and data area checksums are calculated remains elusive, I'm afraid.

For German readers, there is the Databecker "Floppybuch" for the Amiga, which contains this information. I'm generally pretty careful with secondary sources, especially Databecker (you find a lot of nonsense in these books), but this one is pretty complete (though it also contains nonsense you had better filter out).
Title: Re: MFM decode
Post by: olsen on December 18, 2017, 03:04:28 PM: Quote from: Thomas Richter;834190
For German readers, there is the Databecker "Floppybuch" for the Amiga, which contains this information. I'm generally pretty careful with secondary sources, especially Databecker (you find a lot of nonsense in these books), but this one is pretty complete (though it also contains nonsense you had better filter out).

According to the "TrackSalve" source code, the header and data-area checksums are each calculated over the corresponding MFM-encoded header and sector data.

I think that the checksum algorithm works as follows:

Code:
ULONG checksum(const ULONG * encoded_words,int num_words)
{
    const ULONG mask = 0x55555555;
    ULONG sum;

    sum = 0;

    while(num_words-- > 0)
        sum ^= (*encoded_words++);

    sum = ((sum >> 1) & mask) ^ (sum & mask);

    return(sum);
}

The XOR operation is quite handy here, I suppose, since it works regardless of whether the MFM fill bits are present or not. This is not the case for the IBM PC floppy disk format, which uses CRC values.
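
A sketch only, to make the "with or without fill bits" point concrete. As far as I know, the commonly described form of the Amiga checksum XORs the MFM-encoded longwords and keeps just the data-bit positions (mask 0x55555555); folding the odd bits onto the even bits of the decoded longwords should then give the same value. The function names are invented for this example, and the equivalence assumes the odd-block/even-block layout described in the format text.

```c
#include <stddef.h>
#include <stdint.h>

#define MFM_DATA_MASK 0x55555555u

/* XOR over MFM-encoded longwords (odd block followed by even block);
 * the fill bits are masked off at the end. */
uint32_t checksum_encoded(const uint32_t *mfm, size_t num_longs)
{
    uint32_t sum = 0;

    while (num_longs-- > 0)
        sum ^= *mfm++;

    return sum & MFM_DATA_MASK;
}

/* The same value computed from already decoded longwords:
 * XOR everything, then fold the odd bits onto the even bits. */
uint32_t checksum_decoded(const uint32_t *data, size_t num_longs)
{
    uint32_t sum = 0;

    while (num_longs-- > 0)
        sum ^= *data++;

    return ((sum >> 1) ^ sum) & MFM_DATA_MASK;
}
```

Because the fill bits are masked away, their actual values do not influence the result, so the check can be made either before or after decoding.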

It might be worth looking up the old Amiga 68k NetBSD/Linux kernel floppy driver code for reference.
Title: Re: MFM decode
Post by: olsen on December 18, 2017, 04:03:24 PM: Quote from: orange;834180
does 'rawread' dump MFM encoded data? if so, how to decode it with given sync, track size, etc..? (not a standard 880K diskette)

If I understand this correctly, you can tell trackdisk.device in TD_RAWREAD mode to start reading as soon as it finds the sync pattern of your choice. This should save you the trouble of finding the beginning of the sector, which can be shifted by 1..15 bits.

You do need to know the sector size that is going to be used, though. In the standard Amiga format you'll encounter 32 bytes of header data in addition to the 512 bytes of sector data, including the sync pattern (four bytes total) which introduces the header. In this format you'll need (32+512) * 2 = 1088 bytes worth of memory to read the MFM-encoded data.
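
The arithmetic can be written down as a few constants. This is only a sketch based on the sector layout quoted earlier in the thread; the enum and function names are hypothetical, and the track figure ignores the gap.

```c
#include <stddef.h>

/* Sizes in bytes of one standard Amiga sector BEFORE MFM encoding. */
enum {
    AMIGA_SYNC_BYTES        = 4,   /* 00 00 A1 A1 */
    AMIGA_INFO_BYTES        = 4,   /* format, track, sector, sectors to gap */
    AMIGA_LABEL_BYTES       = 16,  /* OS recovery info */
    AMIGA_CHECKSUM_BYTES    = 8,   /* header and data-area checksums */
    AMIGA_DATA_BYTES        = 512, /* payload */
    AMIGA_SECTORS_PER_TRACK = 11
};

/* MFM doubles every bit, so the encoded sector is twice as large. */
size_t amiga_sector_mfm_bytes(void)
{
    size_t raw = AMIGA_SYNC_BYTES + AMIGA_INFO_BYTES + AMIGA_LABEL_BYTES
               + AMIGA_CHECKSUM_BYTES + AMIGA_DATA_BYTES;   /* 32 + 512 */

    return raw * 2;
}

/* All sectors of one track, without the track gap. */
size_t amiga_track_mfm_bytes(void)
{
    return AMIGA_SECTORS_PER_TRACK * amiga_sector_mfm_bytes();
}
```

A raw read buffer needs to be somewhat larger than the track figure, since the gap and the partial sectors at both ends of the read come on top.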
Title: Re: MFM decode
Post by: orange on December 18, 2017, 08:42:07 PM: what does ' length 12656/4' mean in output?
how long is the header, what is its format?
thanks.

edit: is it like this:

OFFSET Count TYPE Description
0000h 8 byte 'UAE-1ADF'
0008h 4 byte trackcount
000Ch 4 byte 0=amigados 1=raw mfm
0010h 4 byte tracklength
0014h 4 byte tracklength in bits
0018h 4 byte 0=amigados 1=raw mfm
...

I just can't find '0xAAAA AAAA 4489 4489' :(

( http://lclevy.free.fr/adflib/adf_info.html#p23 )
Title: Re: MFM decode
Post by: olsen on December 19, 2017, 08:44:14 AM: Quote from: orange;834196
what does ' length 12656/4' mean in output?
how long is the header, what is its format?
thanks.

edit: is it like this:

OFFSET Count TYPE Description
0000h 8 byte 'UAE-1ADF'
0008h 4 byte trackcount
000Ch 4 byte 0=amigados 1=raw mfm
0010h 4 byte tracklength
0014h 4 byte tracklength in bits
0018h 4 byte 0=amigados 1=raw mfm
...
Shrug... this does not look like anything I would expect to find on a standard Amiga formatted floppy disk. Are you sure you are looking for MFM data? If this is the data structure layout, I would expect it to be a container format, not the contents.

Quote
I just can't find '0xAAAA AAAA 4489 4489' :(

( http://lclevy.free.fr/adflib/adf_info.html#p23 )

You may not be able to see this pattern in the encoded MFM data at all. The thing is, this is a bit pattern, not a byte pattern. It can start in the MFM bit stream at virtually any position in the track buffer, but usually it's somewhere near the beginning of the buffer.

So, how do you find the bit position where it starts? The key is the 0xAAAA pattern, which either shows up as 0xAAAA in the MFM bit stream (if the header starts at an even bit position), or as 0x5555 (if it starts at an odd bit position).

The first step to decoding is to find out where the 0xAAAA bit pattern shows up. Because it covers 32 bits, you should be able to find it by looking for any two consecutive bytes which either read as 0xAA or as 0x55.

Code:
UWORD * mfm_buffer;
int mfm_buffer_size, i;
int num_words = mfm_buffer_size / sizeof(*mfm_buffer);
UWORD pattern;
int word_position = -1;

for(i = 0 ; i < num_words ; i++)
{
    if (mfm_buffer[i] == 0xAAAA)
    {
        pattern = 0xAAAA;
        word_position = i;
        break;
    }
    else if (mfm_buffer[i] == 0x5555)
    {
        pattern = 0x5555;
        word_position = i;
        break;
    }
}

/* Skip the pattern if it shows up again, which happens
 * if it started at the very first bit of the byte.
 */
if(word_position != -1 && word_position + 1 < num_words && mfm_buffer[word_position+1] == pattern)
    word_position++;
If these two bytes are part of a sector header, then they should be followed by two 0x4489 bit patterns in the next 0..14 bits. You need to figure out which bit position they show up at.

Code:
if(word_position != -1 && word_position + 1 < num_words)
{
    int bit_position = -1;
    ULONG match;

    match = (((ULONG)mfm_buffer[word_position]) << 16) | mfm_buffer[word_position+1];

    for(i = 0 ; i < 15 ; i++)
    {
        if(((match << i) & 0xFFFF0000) == 0x44890000)
        {
            bit_position = i;
            break;
        }
    }
}
At this point you should be able to tell if you found the byte and bit positions of the first 0x4489 sync bit pattern. The next step would be to check if the first 0x4489 pattern you found is followed by another one. If that's the case, you can begin to read the individual words, shift them as needed and reconstruct both the sector header and sector data in their MFM-encoded forms.

Please note that in production code the task of finding the sync words is usually table-driven and does not run in a loop which shifts bits around ;)
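
For experimenting (as opposed to production code), a plain shift-register search over the whole buffer may be easier to get right than the two-step approach. The sketch below is not taken from any existing driver; find_sync_bits is a hypothetical name, and it assumes the buffer holds the bit stream most significant bit first.

```c
#include <stddef.h>
#include <stdint.h>

/* Search a raw MFM byte buffer for a 32-bit pattern at any bit offset.
 * Returns the bit offset at which the pattern starts, or -1 if it
 * does not occur. */
long find_sync_bits(const uint8_t *buf, size_t len, uint32_t pattern)
{
    uint32_t window = 0;
    size_t total = len * 8;
    size_t bit;

    for (bit = 0; bit < total; bit++) {
        /* Shift the next bit of the stream into the 32-bit window. */
        uint32_t b = (uint32_t)(buf[bit / 8] >> (7 - (bit % 8))) & 1u;

        window = (window << 1) | b;

        /* The window is only completely filled from bit 31 onwards. */
        if (bit >= 31 && window == pattern)
            return (long)(bit - 31);
    }
    return -1;
}
```

Calling it with 0x44894489 finds the double sync word of the Amiga format regardless of alignment; the sector header then starts 32 bits after the returned offset.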
Title: Re: MFM decode
Post by: orange on December 19, 2017, 08:56:34 AM: Quote from: olsen;834214
Shrug... this does not look like anything I would expect to find on a standard Amiga formatted floppy disk. Are you sure you are looking for MFM data? If this is the data structure layout, I would expect it to be a container format, not the contents.

that is an output of rawread command, it writes 'extended' ADF format, at least when raw tracks are present.

Quote
You may not be able to see this pattern in the encoded MFM data at all. The thing is, this is a bit pattern, not a byte pattern. It can start in the MFM bit stream at virtually any position in the track buffer, but usually it's somewhere near the beginning of the buffer.

thanks. I was searching at bit-level, but will try again. perhaps rawread removes the sync?

Quote
So, how do you find the bit position where it starts? The key is the 0xAAAA pattern, which either shows up as 0xAAAA in the MFM bit stream (if the header starts at an even bit position), or as 0x5555 (if it starts at an odd bit position).

The first step to decoding is to find out where the 0xAAAA bit pattern shows up. Because it covers 32 bits, you should be able to find it by looking for any two consecutive bytes which either read as 0xAA or as 0x55.
...

thanks. will try.

I've tried encoding 'DOS' to MFM data (after splitting to odd and even bits?bytes?), but can't find the bit pattern in input.
Title: Re: MFM decode
Post by: olsen on December 19, 2017, 10:46:15 AM: Quote from: orange;834215
that is an output of rawread command, it writes 'extended' ADF format, at least when raw tracks are present.
OK, so this is a container format after all.

Quote
thanks. I was searching at bit-level, but will try again. perhaps rawread removes the sync?
I don't know how the "rawread" command works (any pointers to the source code?), but if it uses the standard Amiga MFM encoded format, then it could drop the sync patterns because they are redundant in this container. Mind you, the sector header and sector data would still have to be preserved in properly-shifted form.

Quote
I've tried encoding 'DOS' to MFM data (after splitting to odd and even bits?bytes?), but can't find the bit pattern in input.
The odd and the even bits are encoded separately and stored separately (512 bytes apart). The encoding of the first bit of "DOS" may vary, depending upon the bit which preceded it. "D" = binary 01000100, which comes out as odd=0000 and even=1010 prior to encoding. That could be encoded either as odd=10101010 or odd=00101010 depending upon the preceding bit (sector data checksum) and even=01000100. So there's already a bit of ambiguity here.
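
A small C sketch of this split, for a single longword that is treated as one block (the odd half directly followed by the even half, as with the header info longword). All names are hypothetical. For the 512-byte data area the two halves are 512 bytes apart, so the "preceding bit" of the even half would be the last odd bit of the whole block instead.

```c
#include <stdint.h>

#define MFM_DATA_MASK 0x55555555u

/* Take 16 data bits sitting at the even bit positions of a longword
 * and fill in the MFM clock bits at the odd positions.
 * prev_bit is the data bit that precedes this longword. */
uint32_t mfm_add_clock(uint32_t data_bits, int prev_bit)
{
    uint32_t out = data_bits & MFM_DATA_MASK;
    int pos;

    for (pos = 30; pos >= 0; pos -= 2) {
        int bit = (int)((out >> pos) & 1u);

        /* Clock bit is 1 only between two 0 data bits. */
        if (prev_bit == 0 && bit == 0)
            out |= (uint32_t)1 << (pos + 1);
        prev_bit = bit;
    }
    return out;
}

/* Encode one longword the Amiga way: odd bits first, then even bits. */
void amiga_encode_long(uint32_t value, int prev_bit,
                       uint32_t *odd_mfm, uint32_t *even_mfm)
{
    *odd_mfm = mfm_add_clock(value >> 1, prev_bit);
    /* The last data bit of the odd half is bit 1 of the value. */
    *even_mfm = mfm_add_clock(value, (int)((value >> 1) & 1u));
}

/* Reverse operation: drop the clock bits and merge both halves. */
uint32_t amiga_decode_long(uint32_t odd_mfm, uint32_t even_mfm)
{
    return ((odd_mfm & MFM_DATA_MASK) << 1) | (even_mfm & MFM_DATA_MASK);
}
```

With "D" (0x44) in the top byte and a preceding 0 bit, the odd half starts with 10101010 and the even half with 01000100, which matches the bit patterns given in the post; with a preceding 1 bit the odd half starts with 00101010.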
Title: Re: MFM decode
Post by: orange on December 19, 2017, 01:32:03 PM: ok, thanks.
finally found the problem.
I was using:
$bitdata = unpack "b*",$data;
instead of
$bitdata = unpack "B*",$data;

in perl :/
Title: Re: MFM decode
Post by: kolla on December 19, 2017, 03:08:21 PM: Quote from: orange;834223

finally found the problem.
...
perl :/

Indeed :D
Title: Re: MFM decode
Post by: olsen on December 19, 2017, 03:29:05 PM: Quote from: orange;834223
ok, thanks.
finally found the problem.
I was using:
$bitdata = unpack "b*",$data;
instead of
$bitdata = unpack "B*",$data;

in perl :/

Oh well... give 'C' a try, please. Only a fraction of the expressiveness that leads Perl users to their doom, but the same degree of catastrophic errors easily triggered by a mere single wrong character that is abstrusely difficult to spot ;)
Title: Re: MFM decode
Post by: orange on December 19, 2017, 06:05:06 PM: sure, will do, thanks olsen :)
just need to figure out format of the other, nonAmiga, diskette first.

edit:
it seems to use this format: http://nerdlypleasures.blogspot.rs/2015/11/ibm-pc-floppy-disks-deeper-look-at-disk.html

I've managed to decode the whole diskette, but couldn't check CRC of individual sectors data.
Title: Re: MFM decode
Post by: olsen on December 21, 2017, 02:40:51 PM: Quote from: orange;834230
sure, will do, thanks olsen :)
just need to figure out format of the other, nonAmiga, diskette first.

edit:
it seems to use this format: http://nerdlypleasures.blogspot.rs/2015/11/ibm-pc-floppy-disks-deeper-look-at-disk.html

I've managed to decode the whole diskette, but couldn't check CRC of individual sectors data.

The source code for messydisk.device is available from AmiNet (http://aminet.net/disk/misc/MSH-1.58.lha), and the sector decoding including the CRC processing can be found in the "dev/devio2.c" file, specifically the CMD_Read() function. The CRC values are decoded along with the track data (in "dev/devio1.a") and eventually checked in CMD_Read().
Title: Re: MFM decode
Post by: orange on December 21, 2017, 06:43:54 PM: thanks, excellent, finally it works.
I didn't know that:
Quote
; The CRC is computed not only over the actual data, but including
; the SYNC mark (3 * $a1) and the 'ID/DATA - Address Mark' ($fe/$fb).
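
For reference, a minimal bitwise C sketch of the CRC variant that the Perl function crcccitt computes (polynomial 0x1021, initial value 0xFFFF, no bit reflection). This is my own illustration, not code from messydisk.device, and the function name is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT, polynomial 0x1021, processed MSB first.
 * Pass 0xFFFF as the initial value; the return value can be fed back
 * in as crc to continue over further data. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len, uint16_t crc)
{
    while (len-- > 0) {
        int i;

        crc ^= (uint16_t)(*data++ << 8);

        for (i = 0; i < 8; i++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

Running it over the three sync bytes A1 A1 A1 gives 0xCDB4, so a decoder can start from that value and continue with the address mark and the data. If the two stored CRC bytes (high byte first) are included in the run, an undamaged sector ends with a CRC of 0.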
Title: Re: MFM decode
Post by: olsen on December 22, 2017, 02:09:39 PM: Quote from: orange;834286
thanks, excellent, finally it works.
I didn't know that:

So, you are going to publish the fruits of your research, aren't you? :)
Title: Re: MFM decode
Post by: orange on December 22, 2017, 02:20:03 PM: erm, I only got perl for now, it decodes raw .adf into Tim011 image ( CP/M )
so the 'rawread' output is input for this thing.
it probably could have been one-liner :)

Quote

#!/usr/bin/perl
# decode MFM TIM011 SECTOR SIZE= 1024bytes, sectors_per_track=5

use strict;
use warnings;
use Digest::CRC qw(crc32 crcccitt crcccitt_hex crc16 crc16_hex crc_hex crc8_hex);

use utf8;
use bytes;

# OFFSET Count TYPE Description
# 0000h 8 byte UAE-1ADF
# 0008h 4 byte trackcount

# 000Eh 4 byte 0=amigados 1=raw mfm
# 0010h 4 byte tracklength
# 0014h 4 byte tracklength in bits
# ..

my ( $i, @tracks, $trackdata, $trackcount, $header, $bitdata, @sector, @track, $image, $data );
my ( $file_in, $file_out ) = ( $ARGV[0], $ARGV[1] );
my $syncbit = unpack "B*", "\xAA\xAA\xAA\xAA\x44\x89\x44\x89\x44\x89";
my $sector_per_track = 5;
my $sector_length = 1024;

{local $/; open( FILE0, "< :raw :bytes", "$file_in" ) or die "Can't open for reading: $!\n"; $data = <FILE0>; close(FILE0);}
$header = substr ($data, 0, 2004); # header = 8+4+166*12 = 2004
$trackcount = unpack "N", substr ($data, 8, 4);
$trackdata = substr ($data, 12, 12*$trackcount);
@tracks = $trackdata =~ /.{12}/gs;
$data = substr ($data, 2004); # print "\n\ndatalen1=" . length $data . "\n";# exit;
$bitdata = unpack "B*",$data;

for my $i (0 .. 159) { #159
    my $tracklenbit = unpack "N", substr ($tracks[$i], 8, 4);
    print "\ntrack=$i, tracklen=$tracklenbit \n";
    my $trackdata = substr ($bitdata, 0, $tracklenbit);
    $bitdata = substr ($bitdata, $tracklenbit);

    for my $j (1 .. $sector_per_track) { # 5
        my $pos = index ($trackdata, $syncbit)+length($syncbit);
        my $sector_id = substr ($trackdata, $pos, 16+ 4*8*2); #16:FE
        $sector_id =~ s/.(.)/$1/gs; #strip clock
        $sector_id = pack "B*",$sector_id;
        my ($track_num, $side_num, $sector_num, $sector_s ) = unpack("xc1c1c1c1", $sector_id);
        print " sector=$sector_num, sector size=$sector_s \n" ;

        $pos = index ($trackdata, $syncbit, $pos)+length($syncbit);
        my $sector_dat = substr ($trackdata, $pos+16, $sector_length*8*2+32); # 16:FB 32:CRC=C74C
        $trackdata = substr ($trackdata, $pos);
        $sector_dat =~ s/.(.)/$1/gs; #strip clock
        $sector[$sector_num] = pack "B*", substr ($sector_dat, 0, $sector_length*8);
        my $crc = unpack "n*", pack "B*", substr ($sector_dat, $sector_length*8);
        my $crc_chk = crcccitt( "\xA1\xA1\xA1\xFB" . $sector[$sector_num]);
        if ( $crc eq $crc_chk) { print "CRC OK!\n"; }
        else { print "CRC ERROR! , $crc, $crc_chk \n\n" ; exit; }
    }

    $track[$i] = join ("", $sector[17], $sector[18], $sector[19], $sector[20], $sector[21]);
}

$image = join ("", @track);
{local $/; open( FILE0, "> :raw :bytes", "$file_out" ) or die "write err: $!\n"; print FILE0 $image; close(FILE0);}

exit;
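
For anyone who prefers C, the header handling of the script could be sketched roughly as follows. The layout is taken from the table and the Perl code in this thread (8-byte magic, 4-byte big-endian track count, then 12 bytes per track: type, length in bytes, length in bits); it is an assumption that real 'rawread' output always looks like this, and the function names are invented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Look up the type and bit length of one track in the header.
 * Returns 0 on success, -1 if the header is missing, too short,
 * or the track number is out of range. */
int ext_adf_track_info(const uint8_t *file, size_t size, uint32_t track,
                       uint32_t *type, uint32_t *bits)
{
    const uint8_t *entry;

    if (size < 12 || memcmp(file, "UAE-1ADF", 8) != 0)
        return -1;
    if (track >= be32(file + 8))
        return -1;
    if (size < 12 + 12 * ((size_t)track + 1))
        return -1;

    entry = file + 12 + 12 * (size_t)track;   /* 12 bytes per track */
    *type = be32(entry);                      /* 0=amigados 1=raw mfm */
    *bits = be32(entry + 8);                  /* track length in bits */
    return 0;
}
```

The track data itself follows the table, one track after the other, which is why the script simply cuts the bit string into pieces of the given lengths.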
Title: Re: MFM decode
Post by: bloodline on February 15, 2018, 11:59:23 AM: Quote from: orange;834196
what does ' length 12656/4' mean in output?
how long is the header, what is its format?
thanks.

edit: is it like this:

OFFSET Count TYPE Description
0000h 8 byte 'UAE-1ADF'
0008h 4 byte trackcount
000Ch 4 byte 0=amigados 1=raw mfm
0010h 4 byte tracklength
0014h 4 byte tracklength in bits
0018h 4 byte 0=amigados 1=raw mfm
...

I just can't find '0xAAAA AAAA 4489 4489' :(

( http://lclevy.free.fr/adflib/adf_info.html#p23 )

I've tried rawread, it doesn't seem to output the MFM data, it just produces an ADF with some extra metadata.
Title: Re: MFM decode
Post by: orange on February 15, 2018, 05:07:56 PM: there is an option to force 'raw' mode. if it still doesn't work, you can try with some NDOS diskette.
Title: Re: MFM decode
Post by: bloodline on February 15, 2018, 05:37:42 PM: Quote from: orange;836158
there is an option to force 'raw' mode. if it still doesn't work, you can try with some NDOS diskette.

I'm testing with a normal OFS floppy...

I did use the -r option... but the output is still ordinary bytes not encoded data :(
Title: Re: MFM decode
Post by: orange on February 15, 2018, 06:34:22 PM: Quote from: bloodline;836160
I'm testing with a normal OFS floppy...

I did use the -r option... but the output is still ordinary bytes not encoded data :(

try -1 and maybe -x
Title: Re: MFM decode
Post by: bloodline on February 15, 2018, 11:24:55 PM: Quote from: orange;836167
try -1 and maybe -x

Good call, using -x -1 I definitely get mfm data! But the first sync word is at 370,076 bytes into the file!! that can't be track 0??
Title: Re: MFM decode
Post by: orange on February 16, 2018, 03:49:13 PM: have you searched bit-by-bit? that's the hardest part. if yes, perhaps it strips sync. dunno, check source. I couldn't understand it completely, perl is much easier. amiga does some weird 'interlacing', too.