# MTH Proto-Sound .mth file structure



## sporadic

I'm splitting this off from my original thread here, as this section seemed more appropriate and I figured a proper thread might gather some attention. For curiosity's sake more than anything (and the fact that I just like doing this stuff), I started researching and tearing into the .mth soundset files to see if I could alter or extract audio data. My end goal is to have a nice steam turbine soundset for the Bantam S2 steamer I picked up off eBay, which unfortunately has a puffer soundset.

I just started on this yesterday, but at first glance there seems to be a 32kB header / data block, with audio data beginning at 0x8000. The audio itself is ADPCM (Adaptive Differential Pulse Code Modulation) at varying sample rates. Most of the audio seems to be sampled at 5.5kHz and 11kHz. I've been able to extract the data and play it back by importing it into Audacity as raw VOX ADPCM. As of now it's just a continuous stream, as I'm still working on figuring out the header section and nailing down the offset references. I'm assuming smoke, lighting control, and throttle curves are in the first 32kB section as well?

Curious if anyone else out there has worked on this before or may have ideas. Here's a section from the Bantam S2 passenger rendered as mp3 - http://forkineye.com/wp-content/uploads/2013/12/s2pass_crew.mp3
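
For anyone who wants to repeat the Audacity import, a minimal Java sketch of the extraction step might look like the following. It only assumes what the post states (audio begins at 0x8000); the class name `MthAudioDump` and the command-line arguments are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class MthAudioDump {
    // Audio start offset as reported in the post above.
    static final int AUDIO_START = 0x8000;

    /** Returns everything after the 32 kB header block, or an empty array for short files. */
    static byte[] audioRegion(byte[] file) {
        if (file.length <= AUDIO_START) {
            return new byte[0];
        }
        return Arrays.copyOfRange(file, AUDIO_START, file.length);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: MthAudioDump <input.mth> <output.raw>");
            System.exit(1);
        }
        byte[] file = Files.readAllBytes(Paths.get(args[0]));
        // The output is headerless; import it in Audacity as raw VOX ADPCM.
        Files.write(Paths.get(args[1]), audioRegion(file));
    }
}
```

The result is still one continuous stream, since nothing here looks at the header.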


----------



## gunrunnerjohn

I think one of the tricks is the audio segments are doubtless just random length audio clips that have a pointer to each unique one. The offsets could be different for each locomotive, and replacing them will require you to dig into the code and find the offset table as well.

I've never looked into this, but I'll be watching to see what you find. 

Soon I may have some interesting technical posts about a project that I'm working on, but it's not ready for prime time just yet.


----------



## sporadic

After banging my head on it for a few days off and on, I managed to figure out the audio directory structure. Starting at 0x0100 there is a series of 16 byte records running up through 0x10E0. I don't have all the fields nailed down yet, but audio sample location and length seems to be spot on. Everything is referenced by frames (16bit) and sample offset count. I've tried this with a few PS2 and PS3 files and they all seem to map correctly. The following is an example of the first 3 records for a PS2 sd70ace file I pulled (R092PF3SD70ACe_ALL100107aF2X.MTH) that should explain it a little better. There's more data further in the header section I haven't worked on yet, but I'm now able to pull all discrete audio from the file. I'm a ways off from modifying or porting soundsets, but it definitely seems doable.



Code:


Frames are 16bit (65,536 byte frame size).
Audio is referenced as Frame number + offset in sample count.
Audio samples are 4bit ADPCM (i.e. 512 bytes = 1024 samples).
All calculated offsets for audio records are in reference to the beginning of the audio data which is at 0x8000 in the .mth file.

//1st audio record
0x0100: 01		// Frame (Start)
0x0101: 00 00 		// Offset sample count into frame for start of audio data (16bit big endian)
0x0103: FF FF FF	// majority of records defined this way. when not, references a frame + offset as above.  unknown, maybe loop data
0x0106: FF FF FF	// majority of records defined this way. when not, references a frame + offset as above.  unknown, maybe loop data
0x0109: 00 00 00	// majority of records defined this way. when not, a pattern is apparent but I haven't analyzed it yet.
0x010C: 01		// Frame (End)
0x010D: CE 79		// Offset sample count into frame for end of audio data (16bit big endian)
0x010F: 0B		// not known yet. seems to be a bitmask or referenced in nibbles. probably sample rate and other flags.
//result: start @ 0; end @ 52857 (0xCE79); 52857 samples

//2nd audio record
0x0110: 01
0x0111: CE 7A
0x0113: FF FF FF	
0x0116: FF FF FF	
0x0119: 00 00 00	
0x011C: 02		// second frame, so add 65,536 to offset
0x011D: F4 6B
0x011F: 0B
//result: start @ 52858; end @ 128107; 75249 samples

//3rd audio record
0x0120: 02
0x0121: F4 6C
0x0123: FF FF FF
0x0126: FF FF FF
0x0129: 00 00 00
0x012C: 03		// 3rd frame, so start at 65,536*2 this time (frame_size*(n-1))
0x012D: D5 03
0x012F: 0B
//result: start @ 128108; end @ 185603; 57495 samples
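
The arithmetic in the listing above can be sketched in Java roughly as follows. This follows the scheme exactly as described in this post (1-based frame number, 65,536 units per frame, 16-bit big-endian offset); the class and method names are invented for the example.

```java
final class SporadicRecord {
    // Frame size as described in the post: 16-bit frames.
    static final int FRAME_SAMPLES = 65_536;

    /** Absolute sample index from a 1-based frame number and a 16-bit big-endian offset. */
    static long sampleIndex(int frame, int offsetHi, int offsetLo) {
        int offset = ((offsetHi & 0xFF) << 8) | (offsetLo & 0xFF);
        return (long) (frame - 1) * FRAME_SAMPLES + offset;
    }

    /** Start index of a 16-byte record (bytes 0..2). */
    static long start(byte[] rec) {
        return sampleIndex(rec[0] & 0xFF, rec[1], rec[2]);
    }

    /** End index of a 16-byte record (bytes 12..14). */
    static long end(byte[] rec) {
        return sampleIndex(rec[12] & 0xFF, rec[13], rec[14]);
    }

    public static void main(String[] args) {
        // Third record from the listing above.
        byte[] third = {0x02, (byte) 0xF4, 0x6C, -1, -1, -1, -1, -1, -1, 0, 0, 0,
                        0x03, (byte) 0xD5, 0x03, 0x0B};
        System.out.println(start(third) + " .. " + end(third));
    }
}
```

For the third record this gives 128108 and 185603, matching the hand calculation.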


----------



## gunrunnerjohn

That looks great. If you get to being able to mod the sound sets, that will be impressive.

One stumbling block is they're almost sure to have a checksum or CRC for the loading function, so you'd have to figure that out as well.


----------



## sporadic

Found some more tidbits last week when exploring newer PS3 files. Found some audio that wasn't ADPCM! It took some digging and bit fiddling, but it ended up being 24bit PCM in big endian. So it seems that page and offset references are by nibble count, not necessarily sample count. Not sure why the 4bit references yet, but it seems to hold true. I threw together a little C# app and am able to extract all the raw audio data into separate files as well as play the 24bit PCM within the app. Once I get the ADPCM decoding and playing within the app, I'll post something for people to play with. Been looking for checksum data, but haven't located it yet. Nothing in the upper header anyway. Still more stuff to parse between the directory structure and audio data.
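
As a rough illustration of the 24bit big-endian layout (this is not the C# app, just a Java sketch), each sample is three bytes with the most significant byte first. Treating the samples as signed is an assumption here, and the class name `Pcm24` is made up.

```java
final class Pcm24 {
    /** Decodes packed 24-bit big-endian PCM into sign-extended ints (assumes signed samples). */
    static int[] decode(byte[] raw) {
        int[] out = new int[raw.length / 3]; // any trailing partial sample is ignored
        for (int i = 0; i < out.length; i++) {
            int b0 = raw[3 * i];             // left signed so the result is sign-extended
            int b1 = raw[3 * i + 1] & 0xFF;
            int b2 = raw[3 * i + 2] & 0xFF;
            out[i] = (b0 << 16) | (b1 << 8) | b2;
        }
        return out;
    }

    /** Narrows a 24-bit sample to 16 bits by dropping the low byte. */
    static short to16(int sample24) {
        return (short) (sample24 >> 8);
    }
}
```

Dropping the low byte with `to16` is just one convenient way to get something most audio tools will accept as 16-bit PCM.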


----------



## gunrunnerjohn

I can assure you there's a CRC or checksum in there somewhere.  They check blocks as they are loaded, so it may be necessary to camp on the data line and decode them that way to find out how they do it.


----------



## sporadic

gunrunnerjohn said:


> I can assure you there's a CRC or checksum in there somewhere.  They check blocks as they are loaded, so it may be necessary to camp on the data line and decode them that way to find out how they do it.


If there is a checksum defined in the file, then either the TIU or the engine is validating it, as the DCS Loader software just checks flash memory size and engine type. I don't have a DCS setup yet, but my bet would be on the TIU, if at all. Although I would expect a checksum in the files, are you positive it's there? Have you experienced that error on corrupted files before, or was it a DCS comms issue? A quick test would be to change any byte in the audio data (anything past 0x8000).


----------



## gunrunnerjohn

I have experienced errors loading several times, each block is checked in some fashion, and I'm assuming a checksum or CRC. I have done tons of downloadable software routines, and it's like Russian Roulette to download with no checking.

I do have the DCS set, and I load lots of locomotives as I do repairs and upgrades.


----------



## sporadic

More than likely that's just DCS doing what it's supposed to do. Reading over the DCS patent documentation in regards to the spread spectrum signalling used for track comms, they have some pretty robust error checking between the engine and TIU. Given the noisy electrical environment, I would expect one would see their fair share of errors when transferring such large amounts of data on an AC signal (or switched DC) via something like train track. Not saying nothing is being checked (lots of checking in the patent docs), just saying I haven't seen checksum data (or proof of it) in the files themselves. All the DCS loader does is some size checking, looks at a few header bytes for engine type, unlocks the TIU, then sends the data while monitoring feedback from the TIU. As a software engineer, if I were going to have a file format containing a checksum, that'd be the first thing I check before I allow the user to do anything....


----------



## gunrunnerjohn

It's quite possible the loader is generating it on the fly as it formats blocks, but I'm 100% sure there is checking. It's quite possible they assume the file on disk is intact and just generate the checksum/CRC on the fly and then check it on the other end. 

That would make hacking the sound files a lot easier.


----------



## Mark DiVecchio

Is anyone still interested in pursuing this topic?

I looked at my 20-3128-1 2-8-4 A-2a Berkshire. I was able to use Audacity to play the sounds in this file.

Was the C# program mentioned above ever completed?


----------



## gunrunnerjohn

I'd love to be able to extract the sounds, I don't know that the OP ever did anything more with it. Being able to patch in some different sounds to MTH files would be very cool!


----------



## Mark DiVecchio

There are many hex dump progs out there. I used this one:

http://www.fileformat.info/tool/hexdump/index.htm

At location hex 1900 of the Berkshire sound file, I can see the hotkey codes.


----------



## gunrunnerjohn

The real question is, how to insert new sounds.


----------



## Severn

I think a simple test would be to take an existing .mth file, identify approximately the first clip, which starts at 0x8000, and change the values, perhaps just writing all 0s or all 1s, or something more sophisticated. Play it back through Audacity to hear the difference.

And then upload it to your engine.

If it works, then in theory clips can be changed -- at least within an existing "clip boundary".
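
A throwaway Java sketch of that experiment could look like this. It works on an in-memory copy and leaves the choice of length and fill value to the caller; `ClipOverwrite` and its arguments are invented names, and only the 0x8000 start comes from the thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class ClipOverwrite {
    static final int AUDIO_START = 0x8000; // start of audio data per the first post

    /** Returns a copy of the file with `length` bytes from the audio start set to `fill`. */
    static byte[] overwrite(byte[] file, int length, byte fill) {
        byte[] copy = file.clone();
        int end = Math.min(copy.length, AUDIO_START + Math.max(0, length));
        if (end > AUDIO_START) {
            Arrays.fill(copy, AUDIO_START, end, fill);
        }
        return copy;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ClipOverwrite <input.mth> <output.mth> <byteCount>");
            System.exit(1);
        }
        byte[] file = Files.readAllBytes(Paths.get(args[0]));
        byte[] patched = overwrite(file, Integer.parseInt(args[2]), (byte) 0x00);
        Files.write(Paths.get(args[1]), patched); // never touches the original file
    }
}
```

Writing to a separate output file keeps the original soundset around in case the engine rejects the edited one.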


----------



## gunrunnerjohn

I suspect the sticking point is the checksum or CRC, whatever checking they use. Generally, this would be generated statically as part of the file.


----------



## Severn

I would guess you are correct. But I don't know where such a thing is in the file. At least one could try it and see... maybe there isn't any, or it's done at send time, or some such....


----------



## Severn

Dumb question: has anyone just asked MTH for a .mth format description document?


----------



## gunrunnerjohn

I guess you haven't dealt with MTH.  They don't release technical details, neither does Lionel.


----------



## Severn

Ok, I've heard this about them. But it seems to me that as long as you have to buy a TIU from them, it [whatever it is] doesn't really undercut them. 

Describing the format of the .mth file would in my mind not mean anything to them from that point of view, I can't see how it would threaten their bottom line. 

If anything it might sell a few more engines to folks who find it novel to personalize an engine's sound in some manner.


----------



## gunrunnerjohn

Well, what it seems to you doesn't change their behavior. MTH is VERY protective of their intellectual property rights, and they don't release stuff like that without jumping through a lot of legal hoops.


----------



## Mark DiVecchio

I hope that sporadic shows up again here. He gave me a lot of hints about the information in the sound file. I wrote a C++ program that analyzes the sound file and lets you play segments at different sample rates. As it goes with my "nothing new under the sun" philosophy, I found most of the hard ADPCM code on the internet and the source files include acknowledgements to the original authors. Zip file with sources (Borland C++ Builder) and executables on my web page at:

http://www.silogic.com/trains/ADPCM.html

Right now, only handles 4 bit mono ADPCM. Sporadic mentioned that some of the newer engines use a different code. Haven't looked at that yet.


----------



## Severn

A+ for you!!! It's a great start!

One issue: when invoked it's looking for vcl40.bpl. I found this on one of those "dll.com" websites and trepidatiously downloaded and copied it to the same directory as the executable, and it was happy.

Second (possible) issue (that could be far worse): I believe the addresses calculated are not quite right somehow. I tried this myself, couple different ways, first going by the original post, then just fiddling about...

I could not get it exactly that my ear could tell in terms of isolating the first clip in my sound file. It somehow did not quite add up.

So... 

In your program now, its first clip plays as 3 clips in my sound file to my ear. It plays a diesel start up sound (with a kind of warning buzzer, going off and on), a possible turning it off sound which I see in Audacity at least as the 2nd clip, and then a coupler sound as the 3rd clip (there might even be another sound before this, but it goes by so quickly it's hard to say).

A couple of days ago I tried really really hard to find the boundary of the first clip in my file by my ear using audacity and ... not only did this turn out to be harder than I imagined -- but it was only thru trial and error of guessing at byte lengths, saving the buffers captured off to a file and running the results through Audacity over and over, that I finally got to the edge of it in terms of data... 

Or so it sounded and looked in the "sound graph" to me...

Still -- not being critical to what you've done at all! Just trying to be helpful.

I can't though really offer any new insights much that are that helpful.

After staring at it a bit the other night I noticed this (weak) pattern:

The area is a series of 16-byte "fields". We know that.

And there are seemingly three kinds of 16-byte patterns to it.

One kind looks something like this, these always end in a 'b', the last nibble that is... 

01 00 00 ff ff ff ff ff ff 00 00 00 02 af 11 0b

The first 3 bytes and 13th-15th bytes we are interpreting as a kind of frame ptr. We ignore the other ones for now.

But the second kind look like this and have much in them, and end 'f''.

03 d5 04 03 e9 1b 03 e5 c8 16 ff b4 03 fa c5 0f

And finally the "we aren't using this " kind rows -- or so it seems, there are a lot of these and they end in 0. Here the magic field mentioned above is all 0s.

ff ff ff ff ff ff ff ff ff 00 00 00 ff ff ff 00

Sorry, that's not terribly helpful.
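
For what it's worth, the three row patterns above are easy to tell apart in code. A small Java sketch, going only by the first byte and the last nibble as observed in this one file (the names `RecordKind`, `PLAIN` and `EXTENDED` are labels made up here, not anything from MTH):

```java
final class RecordKind {
    enum Kind { UNUSED, PLAIN, EXTENDED, OTHER }

    /** Classifies a 16-byte index row by its first byte and the low nibble of its last byte. */
    static Kind classify(byte[] rec) {
        if (rec.length != 16) {
            throw new IllegalArgumentException("record must be 16 bytes");
        }
        if ((rec[0] & 0xFF) == 0xFF) {
            return Kind.UNUSED;      // "we aren't using this" rows
        }
        int lastNibble = rec[15] & 0x0F;
        if (lastNibble == 0x0B) {
            return Kind.PLAIN;       // only start and end pointers filled in
        }
        if (lastNibble == 0x0F) {
            return Kind.EXTENDED;    // middle fields carry data as well
        }
        return Kind.OTHER;
    }

    /** Parses a space-separated hex row such as "01 00 00 ff ...". */
    static byte[] parseHex(String row) {
        String[] tokens = row.trim().split("\\s+");
        byte[] out = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return out;
    }
}
```

Other files may well use last-byte values that fall into `OTHER`; this has only been checked against the rows quoted above.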


----------



## Severn

Here's what my one engine's sound file looks like in Audacity (see attachment). I've provided a few labels, although this is not exact or complete but illustrative.

Mostly the "hard edges" I've interpreted as separate clips. In many cases this is obvious, in some not so much. Partly to see the whole thing, the edges are not always so clear when zoomed out.

Basically the edges correspond in many cases to a transition to a clearly different sound record.

Further I can play via Mark's excellent RTC program or the remote many of these engine and crew sounds as separate items.

If memory serves there are about 10 engine sounds, not counting revving and maybe 6 or so crew sounds. Given the 21 selections available on RTC's dials for each, I feel like there could be more sounds if only they'd added them... 

Anyway, you are now bored, so I'll stop...


----------



## Mark DiVecchio

I appreciate all of your comments. Good to have a second set of eyes on this.

I calculate addresses for "Start", "1", "2", and "End" using the scheme that sporadic
came up with : (frame number * 65,536 byte) + 16 bit offset. 

I treat the offset as a byte count. Sporadic called it a Sample Count but it can't
be. Each segment is 65K bytes and the offset must fit in to 16 bits which is
65K. 

So listening to some of my parsings of the file, you are right, phrases get
chopped, etc.

So I experimented, and sporadic didn't have it quite right.

Frames are actually 0x8000 (15 bit or 32,768) bytes. Or the other way to look
at it is they are 65536 samples. Offsets into the frame are
in samples (not bytes) so you need 16 bits to count all of the 4bit samples
in the frame.

New version of my parser program v1.0.2 on my web page. It works much
better now.

The last byte of the 16 byte index entry seems to include information about 
the sample rate. In my sound files, I've found 5500, 11000 and 22050 Hz
segments.
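
The corrected address calculation can be written down in a few lines of Java. This is a sketch of the scheme as described in this post (0x8000-byte frames, offsets counted in 4bit samples), with frame numbers taken as 1-based as in the earlier record examples; `MthAddress` is an invented name and this is not the code from the parser program.

```java
final class MthAddress {
    static final int AUDIO_START = 0x8000; // audio data begins here in the file
    static final int FRAME_BYTES = 0x8000; // 32,768 bytes = 65,536 4-bit samples

    /** File offset of the byte holding the given sample (frame is 1-based). */
    static int byteAddress(int frame, int sampleOffset) {
        return AUDIO_START + (frame - 1) * FRAME_BYTES + (sampleOffset & 0xFFFF) / 2;
    }

    /** 0 or 1: which nibble of that byte the sample sits in (which one comes first is not known here). */
    static int nibbleInByte(int sampleOffset) {
        return sampleOffset & 1;
    }

    public static void main(String[] args) {
        // Start pointer "03 d5 04" from the second row pattern quoted earlier in the thread.
        System.out.println(String.format("0x%05X", byteAddress(0x03, 0xD504)));
    }
}
```

With the start pointer `03 d5 04` this prints 0x1EA82.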


----------



## Severn

Fantastic! That's pretty much done it I think!! Except...

A few still seem clipped to me. One is a crew talk clip and goes like this to my ear in RTC: "They've cleared us to go whenever you're ready."

However, in this software "ready" is clipped off:
"They've cleared us to go whenever you're" [clip].

But I need to verify it... a few others to me end more abruptly than my memory but most are spot on! (since so many seem fine this seems odd... maybe my brain just fills in what it thinks should be there!)

A number of misc. observations:

Still a bit baffled by the last byte. I believe these are 0E, 1E, 2E, 4E, 5E ... 5E seem to be 5500 hz, and all the rest are 11000. 

Then there's the mysterious 3 byte value that's all 00s in most entries, except for a few that have values there (these end in F on the last byte).

And there's more:

There are duplicate clips in mine at least. Why? 

The engine rev up and rev down sounds seem to occur in one large block... but this block is not obviously tagged as such... perhaps such sound always at these locations... or, ?

There are a lot of very short faint sounds... from RTC I can mostly not hear these while the thing is running, even with engine sounds off... why bother putting such in if you can't even hear it!

Is there an order to it? Some TOC someplace maybe... how does train wreck get associated with "train wreck" and so forth...

Anyway, great work! Next steps?

I've only got the 1 engine. And it's only PS2. If you want to try it with a PS3 file, I'm game -- but I suggest we start with the same sound file.

Do you have a candidate?

(or....?)


----------



## Severn

Here's my contribution to the software in this area. It's not nearly what Mark has done, but it can run wherever you can install Java. (Java 8)

Give it a .mth file and it will print out the clip start/end addresses it finds, length and some other info similar to the table in Mark's player.

Note: this doesn't play a thing, it just prints info. Feel free to expand!

The usual caveats: tested on 1 .mth proto2 sound file only, and only spot checked against Mark's table using the same file -- so not exhaustive ...

package YOURPACKAGENAMEHERE;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class MTHSoundFileParser {
private String pathToFoundFile;
private Path pathToSoundFile;
private byte[] rawSoundBuf;
private ByteBuffer sbb;

public MTHSoundFileParser(String pathToSoundFile) {
this.pathToFoundFile = pathToSoundFile;
}
public void importFile() throws IOException {
openSoundFile();
readSoundFileData();
wrapByteBuffer();
}

public void parseRecords() {
// according the original thread in the forum, the clip records
// start at offset 0x0100 and that is the case...
//

int samplesAddressStart = 0x8000;
int startAddr = 0x0100;
int endAddr = 0x10f0; // 0x10e0;
int frameSize = 32*1024;

int clipTableCount = (endAddr - startAddr) / 16;

sbb.position(startAddr);

sbb.order(ByteOrder.BIG_ENDIAN); // this is the default in java machine

int clipsFoundCount = 1;

for (int i = 0; i < clipTableCount; i++) {

// ByteBuffer has several methods to get various sized
// things. Due to the fact that everything in java is signed
// casting to larger data type and masking must be done to avoid sign issues
//

// it starts with up to 3 "start addresses"
int[] frameStart = new int[3];
int[] offsetSampleCountStartHi = new int[3];
int[] offsetSampleCountStartLo = new int[3];

for (int s = 0; s < 3; s++) {
// 3 byte groups: frame, sample count start hi/lo byte
frameStart[s] = (int)sbb.get() & 0x000000ff;

// addr = 0, 1, 2
offsetSampleCountStartHi[s] = (int)sbb.get() & 0x0ff;
offsetSampleCountStartLo[s] = (int)sbb.get() & 0x0ff;

}


// Then unknowns, usually 00s, sometimes not if the last byte ends in F
int[] unknowns = new int[3]; 
unknowns[0] = (int)sbb.get() & 0x0ff;
unknowns[1] = (int)sbb.get() & 0x0ff;
unknowns[2] = (int)sbb.get() & 0x0ff;

// Then the frame end
int frameEnd = (int)sbb.get() & 0x000000ff;

int offsetSampleCountEndHi = (int)sbb.get() & 0x0ff;
int offsetSampleCountEndLo = (int)sbb.get() & 0x0ff;

// Finally the mysterious last byte, with data it
// seems to be 0B, 1B, 2B, 4B, 5B, or ends in F, like 1F
//
int endByte = (int)sbb.get() & 0x000000ff; 

// just skip any 16-byte groups that start with 0xff
// note we have to allow the ByteBuffer gets() to complete to here to keep
// its internal ptr in the right place
// so, a little extra work
//
if (frameStart[0] == 255) continue; // this will skip the entire unused "no clip" rows

// now calculate up to 3 sample start addresses, and one end address
int[] sampleStart = new int[3];

for (int ss = 0; ss < 3; ss++) {
if (frameStart[ss] == 255) {
// not used... really what we care about here are 2nd and 3rd start 
// addresses which sometimes appear... as we've skipped any that start
// with 0xff already
//
sampleStart[ss] = -1;
continue;
} else {
sampleStart[ss] = ((frameStart[ss]-1) * frameSize) + (((offsetSampleCountStartHi[ss]*256) + offsetSampleCountStartLo[ss])/2) + samplesAddressStart;
}
}
int sampleEnd = ((frameEnd-1) * frameSize) + (((offsetSampleCountEndHi*256) + offsetSampleCountEndLo)/2) + samplesAddressStart;


// print the info out in a reasonable way
System.out.println("====> Clip " + clipsFoundCount++);


List<Integer> addresses = new ArrayList<Integer>(); // just to calc. clip length...

for (int sss = 0; sss < 3; sss++) {
if (sampleStart[sss] == -1) continue;
System.out.println("Start(" + sss + ") address ==> " + String.format("0x%08x", sampleStart[sss]));

// valid addresses go here
addresses.add(sampleStart[sss]);
}

System.out.println("End address =======> " + String.format("0x%08x", sampleEnd));

addresses.add(sampleEnd);

System.out.println("Mystery bytes =====> " + String.format("0x%02x",unknowns[0]) + " " + String.format("0x%02x",unknowns[1]) + " " + String.format("0x%02x",unknowns[2]));

System.out.println("Last byte =========> " + String.format("0x%02x",endByte));

System.out.println("Clip length =======> " + clipLengthFromAddress(addresses));

System.out.println();
}

}
private Integer clipLengthFromAddress(List<Integer> addresses) {
// FIXME, ugly
Integer length = null;
if (addresses.size() == 2) { 
// there can't be less than 2, and i don't check for it
int len = (addresses.get(1) - addresses.get(0)) + 1;
length = Integer.valueOf(len);
} else if (addresses.size() == 3) { 
int len = addresses.get(1) - addresses.get(0);
len += (addresses.get(2) - addresses.get(1)) + 1;
length = Integer.valueOf(len);
} else {
// and there can't be more than 4, and i don't check for it
int len = addresses.get(1) - addresses.get(0);
len += (addresses.get(2) - addresses.get(1));
len += (addresses.get(3) - addresses.get(2)) + 1;
length = Integer.valueOf(len);
}
return length; // if null appears in the output, there's a bug
}
private void openSoundFile() {
pathToSoundFile = Paths.get(pathToFoundFile);
}
private void readSoundFileData() throws IOException {
rawSoundBuf = Files.readAllBytes(pathToSoundFile);
}
private void wrapByteBuffer() {
sbb = ByteBuffer.wrap(rawSoundBuf);
}

@Deprecated
private void grabFirstClip(String filePath) throws IOException {
//rewind the byte block
sbb.rewind();
sbb.position(0x8000); // start of the first clip
byte[] clipBuf = new byte[55179]; // try the values in the 1st, 2nd record (failed: this is by trial and error for my file & may be off by a sample or two)
sbb.get(clipBuf);

Path clipFile = Paths.get(filePath);

Files.write(clipFile, clipBuf); // create, truncate, etc..
}
public static void main(String args[]) {
if (args.length != 1) {
System.err.println("Need .mth sound file to proceed.");
System.exit(1);
}
try {

MTHSoundFileParser sfp = new MTHSoundFileParser(args[0]);
sfp.importFile();
sfp.parseRecords();
//sfp.grabFirstClip(args[1]);

} catch(Exception e) {
e.printStackTrace();
}
}
}


----------



## Mark DiVecchio

A followup comment on transferring sound files --

Since I can see where the hotkey codes are in the sound file, I edited a sound file to add a new hotkey. I changed one byte from 0x00 to 0x01, which should add the Beacon Light hotkey. I loaded the edited sound file into the engine, added the engine to a remote, and the new hotkey showed up in the list. I did not get any error messages from the loading.

This is only one data point, but at least this area of the sound file is not covered by any precalculated CRC.


----------



## Severn

Wow! That's a great result!

If it were true, I'd really like to replace the bird tweet/barking sound clip which apparently comes with the engine -- which I find to be annoying.

I would also like to add sounds, and it may seem childish but my children would really like to hear Thomas engine(s) sounds coming out of it -- even if the engine I have is a diesel.

If that could be made to work I'd be motivated to buy the closest MTH steamer to Thomas I can find -- paint it blue and possibly stick a face on it of my own design.

And add "Cinders and Ashes" or any of his other catch phrases to it... so I can play them back through the engine at any time using the remote (well actually RTC but you get the idea) -- or as part of "crew talk".


----------



## gunrunnerjohn

That's cool Mark, I really wondered if the CRC would be the stumbling block.


----------



## Mark DiVecchio

I now understand what the other columns in the index are for.

As I was listening to clips, I realized that some clips are repeated when more 'continuous' sounds are played by the engine. For example, the background 'rev level' sounds.

I saw that every time there was what I considered a repeatable sound, the columns that I labeled (1), (2) and "Unknown" had values.

I saw that the values in the (1) and (2) columns were always in the range of the overall clip start and end addresses so I figured that they must be addresses. The (1) is always higher than (2) so I figured (2) was a start address and (1) was an end address. I modified my program to play these sub-clips but nothing came out of the speaker. I turned the volume way up and I could just hear what I thought were the correct sounds for the sub-clip.

When I learned about converting the 4 bit ADPCM to 16 bit PCM, I found code for a state machine to do this. That is what is in my program. That state machine requires a starting state and starting sound value. So for a sub-clip that starts in the middle of a longer clip, just maybe the "unknown" column was that initial state information. (If the state machine starts in the wrong state, it will be in the wrong state forever. So we need initial values.)

I just took a wild guess and set the initial state to first byte and initial sound value to the second/third bytes. And the sub-clip played.
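
To show why the starting state matters, here is a small Java sketch of a 4-bit ADPCM decoder that takes its initial step index and previous value as constructor arguments. It assumes the standard IMA step table and high-nibble-first packing; both are assumptions for illustration, this is not the code from the ADPCM program, and the class name is made up.

```java
final class ImaAdpcm {
    private static final int[] INDEX_ADJUST = {-1, -1, -1, -1, 2, 4, 6, 8};
    private static final int[] STEPS = {
        7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
        19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
        50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
        130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
        337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
        876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
        2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
        5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
        15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
    };

    private int index;     // position in the step table
    private int predictor; // previous output sample

    /** For a sub-clip, pass the state stored with the index entry instead of (0, 0). */
    ImaAdpcm(int initialIndex, int initialPredictor) {
        index = Math.max(0, Math.min(STEPS.length - 1, initialIndex));
        predictor = initialPredictor;
    }

    /** Decodes one 4-bit code and advances the state machine. */
    int decodeNibble(int nibble) {
        int step = STEPS[index];
        int diff = step >> 3;
        if ((nibble & 4) != 0) diff += step;
        if ((nibble & 2) != 0) diff += step >> 1;
        if ((nibble & 1) != 0) diff += step >> 2;
        predictor += (nibble & 8) != 0 ? -diff : diff;
        predictor = Math.max(-32768, Math.min(32767, predictor));
        index = Math.max(0, Math.min(STEPS.length - 1, index + INDEX_ADJUST[nibble & 7]));
        return predictor;
    }

    /** Decodes nibbleCount codes starting at nibble position firstNibble in data. */
    short[] decode(byte[] data, int firstNibble, int nibbleCount) {
        short[] out = new short[nibbleCount];
        for (int i = 0; i < nibbleCount; i++) {
            int n = firstNibble + i;
            int b = data[n >> 1] & 0xFF;
            int nibble = (n & 1) == 0 ? (b >> 4) : (b & 0x0F); // assumption: high nibble first
            out[i] = (short) decodeNibble(nibble);
        }
        return out;
    }
}
```

Every output depends on the running `index` and `predictor`, so starting a sub-clip from (0, 0) in the middle of a longer clip decodes with the wrong step size, which would fit the barely audible result described above.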

The updated ADPCM program v 1.0.5 and source code are on my usual web page.


----------



## Severn

Oh that is good sleuthing! I was thinking something similar, in that somehow those entries designated multi-part clips, where on my engine a few sounds seem linked together but are separate clips ("Check the sand", "The sand is topped off!"). But I didn't get as far as actually trying to figure out what these entries mean! And they don't really mean that, as I understand your post ... they are for repeating, looping sounds.


----------



## gunrunnerjohn

Now if we could just document all of this.


----------



## Severn

So for example, this clip address and its corresponding values:

0x0360(a), 0x1EA82(s), 0x1F48D(1), 0x1F2E4(2), 0x16FFB4(u), 0x1FD62(e), 4833(l), 0x0F(d)

The #1 and #2 are within the total span of "s" and are a "repeating sub-clip" -- "u" is the initial start value for the ADPCM codes starting at #1?


----------



## Severn

here's another, a variation I believe:


0x03D0(a), 0x544C9(s), 0x45AED(1), 0x544C9(2), 0x00000(u), 0x45AED(e), 9765(l), 0x1F(d)

To me it looks like it says: the "subclip" is the entire clip in reverse and the "start value" is 0?


----------



## Mark DiVecchio

I'll use an example from my P&LE SW1500 #1561:

Here are the values from left to right in the index entry at 0x0170:

0x00019E3F starting address of full clip
0x0001B995 ending address of repeatable segment
0x0001ABCB starting address of repeatable segment
0x3802FA initial values for state machine when playing repeatable segment:
index=0x38, previousValue=0x02FA (look at my source code)
(This value appears to be zero for clips without a repeatable segment)
0x0001BF7F ending address of full clip
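
Splitting that 3-byte value into the two state-machine inputs is a one-liner each. A small Java sketch, where treating the 16-bit previous value as signed is an assumption and `LoopState` is an invented name:

```java
final class LoopState {
    final int index;         // step table index for the ADPCM state machine
    final int previousValue; // starting sample value

    LoopState(int index, int previousValue) {
        this.index = index;
        this.previousValue = previousValue;
    }

    /** Unpacks the 3 bytes: first byte = index, next two = previous value (big endian). */
    static LoopState fromBytes(int b0, int b1, int b2) {
        int index = b0 & 0xFF;
        // Assumption: the 16-bit value is signed, as decoded PCM samples normally are.
        int previous = (short) (((b1 & 0xFF) << 8) | (b2 & 0xFF));
        return new LoopState(index, previous);
    }

    public static void main(String[] args) {
        LoopState s = fromBytes(0x38, 0x02, 0xFA);
        System.out.println("index=" + s.index + " previousValue=" + s.previousValue);
    }
}
```

For 0x3802FA this gives index 56 and previous value 762. An index of 56 would not fit the 49-entry step table of Dialogic VOX ADPCM, which hints that the 89-entry IMA table is in use, though that is only an inference.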


----------



## Severn

I think I get it.

My examples are from the file p071pf3_sd70ace_up070604afin.mth.

Which I neglected to share... it's a ps2 diesel with a "turbine"... as I understand it.

I copied the values from rows in the ADPCM player.

I would like to know if you believe as I do that this one means the sub-clip is the reverse of the entire clip:

Key: a = address column, s = start column, 1 = 1 column, 2 = 2 column, u = unknown column, e = end column, l = length column and d = data column:

0x03D0(a), 0x544C9(s), 0x45AED(1), 0x544C9(2), 0x00000(u), 0x45AED(e), 9765(l), 0x1F(d)


----------



## Mark DiVecchio

In your case, it appears that the entire clip is repeatable. It's still played in the normal direction. The initial values for the state machine are zero because the full clip and the sub-clip are the same.


----------



## Severn

I'm confused by that because I'd have thought (1) and (2) would be played in that order... no?

In other words (s)==(2) here and (e)==(1), the first run through went from s-->e, then the "sub-clip" went from 1-->2...

?


----------



## Mark DiVecchio

Well, that is what one would think but for some reason, it's the other way around. If you consider the values of the addresses, the Start address must be lower in memory than the End address.


----------



## Severn

Not trying to argue, I'd just imagined it was an address ptr decrement operation after detecting the first address being larger than the second. But if it sounds correct the other way, then that's got to be right. Seems a little odd to capture it that way but it is what it is...


----------



## L0stS0ul

This is awesome work. It would be awesome to mix and match sound files to create one you like. I'm just starting to read up on what you all have done and hope to start playing with it too. It's amazing how many sounds and announcements are inside these files that I have never heard come from the engine. Seems like the one I'm looking at has a combination of passenger station and freight sounds inside it. There must be a bit somewhere that sets which sound set the engine is configured with.


----------



## Severn

It seems like many folks would have the interest. If for example you had several different numbered engines of the same type -- you might have the interest in differentiating them a little by sound.


----------



## Mark DiVecchio

I've experimented with editing the sound file. I was able to change sounds and load the new sound file into the engine and it played.

The first thing I tried was to copy an index entry from a clip into the first index entry which appears to be the startup sound. Downloaded the edited sound file into the engine without error and the new sound played on startup.

Then I copied an actual sound clip over top of the shutdown sound clip. In the index, I had to adjust the end address to match the copied clip. Downloaded and the copied sound played on shutdown.

So this is possible. There is apparently no CRC in the sound file itself.
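
The first experiment (copying one index entry over another) could be sketched in Java like this. It relies only on the layout worked out earlier in the thread (16-byte entries starting at 0x0100); `IndexPatch` and the zero-based entry numbering are choices made for the example, and there is no bounds checking against the end of the index.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class IndexPatch {
    static final int INDEX_START = 0x0100; // first index entry, per the thread
    static final int RECORD_SIZE = 16;

    /** Returns a copy of the file with index entry `from` copied over entry `to` (0-based). */
    static byte[] copyEntry(byte[] file, int from, int to) {
        byte[] copy = file.clone();
        System.arraycopy(file, INDEX_START + from * RECORD_SIZE,
                         copy, INDEX_START + to * RECORD_SIZE, RECORD_SIZE);
        return copy;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: IndexPatch <input.mth> <output.mth> <fromEntry> <toEntry>");
            System.exit(1);
        }
        byte[] file = Files.readAllBytes(Paths.get(args[0]));
        byte[] patched = copyEntry(file, Integer.parseInt(args[2]), Integer.parseInt(args[3]));
        Files.write(Paths.get(args[1]), patched);
    }
}
```

Because only the pointer is duplicated, the audio data itself is untouched and the file size stays the same.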

Someone could write a program that makes it easier to edit the sound file. You could use Audacity (or some other audio capture program) to get new sounds.

We still don't understand how we indicate the sample rate of the sound file. It appears that PS2 engines can play at 5,512 and 11,025 Hz. PS3 engines can additionally play 22,050 Hz. It appears PS2 engines play VOX/IMA ADPCM and PS3 engines additionally play 24 bit signed PCM.


----------



## L0stS0ul

That is awesome progress !! nice work. I'm more of a UI programmer and I can help with the frontend stuff. I have access to the full visual studio and installer tools. If you could create a version of your program that was command line based and took in parameter files I could create a ui for it. Even allow people to preview the file. I saw that the version posted on the site is borland builder and I don't have access to that. Maybe we could create a github repo and start collaborating. 

I've never been good dealing with binary files and I looked at what you have done. I may not be able to help on that end but I can help on the UI part. I would love to.


----------



## Mark DiVecchio

Severn (on this list) has ported a slightly older version of RTC to the latest Embarcadero C++ 10.1 (this is the 10th generation of Borland C++ Builder - I used v4.0). Amazingly his port was successful.

We have a mailing list at:

https://groups.google.com/forum/#!forum/remote-train-control

People who have an interest can request to join in the technical discussions.


----------



## Severn

Well back now a few weeks, I had the idea of trying QT with the gnu tool chain -- porting it to that. 

I did some of RTC's pages in QT "creator" but quickly realized the large effort involved when I then tried to import the code into another tool chain I have...

Then we got to talking about moving it to the more recent Embarcadero (Borland) Builder 10 suite ... ("B10")

And somehow by dribs and drabs, well that is what is happening.

I think showing that it is buildable/runnable on my computer, it is now more realistic to me at least to port it to something else... 

But the basic B10 tool is cheap ($45 on sale!) The add on GUI components are also pretty cheap... ($80)

The result works ... 

Would Mark rather have it as QT on the gnu tool chain? Or Java, or Python or... just stick with B10?

(Well I guess a lot just depends on your interests really too...)


----------



## Severn

On the sound file.. that's amazing news!! But I have been suspicious it would work for a while. 

I believe looking at my PS2 sound file now many times, and it being an older engine -- that the number of real clips there isn't that numerous. Instead the bulk of the sound file is taken up with the "turbine sound" (it's one of those SD70ace things)... 

And as there's only so much flash on the PS2 board, there's really no other room to steal so to speak to literally add clips.

But I can replace some maybe...

There is definitely one strange clip on it which occurs at the end of the file - it sounds like birds tweeting and there's a dog barking.

Given its location -- I have thought this is a fill clip, to fill the file out to the end of flash memory... that simple.

(or maybe there's some unknown but important meaning to this "birds and barking" sound clip they've included... although this is lost on me)

Given that, I figured the simplest and easiest thing to do would be to replace it with something else... byte for byte.

But I can't imagine every engine's sound file has this "extra" fill clip in it -- a throw away clip just waiting to be replaced. Still this would be the easiest way to add a new sound, well replace one... you clearly didn't care for at all.

After that I might be tempted to go after some of the dialog clips, that is replace them byte for byte.. some are to me at least of poor quality. So I would not miss some of them. And it might be fun to try to do better or find better. ("Ground control to major thom.." Ok maybe not that.)

Then there's a fair number of shop/"rail" sounds which mostly sound like pressurized air being released... or are completely unidentifiable. I'm a little underwhelmed by (some of) these too. And would not miss them either.

Maybe again they could be replaced byte for byte... (although with what exactly I'm not sure...)

After that it's horns and engine sounds, couplers, clickety/clack and a bell. I'd probably leave these alone completely. 

And oddly I'm a bit partial to the train wreck sound which as "fakey" (to me at least) as it sounds, always makes me giggle...

So that's another I'd not want to touch ...

Still all in all just replacing up to say a half dozen clips would be fun I think... and while it isn't clear I'd truly prefer the result ... if not, I can always go back to the original...

(So in way, it's a nice feature MTH almost supports then!)


----------



## gunrunnerjohn

With all the free and excellent C/C++ platforms available, I'd like to see this running on something that doesn't cost an arm and a leg.


----------



## Mark DiVecchio

Well, the RTC program is free and runs on XP, Win7, Win8 and Win10. You can download it from my web page. The only money involved is buying the radio (about $40 plus shipping from Spain).

If you want to actually edit and compile the program itself, the source code is also available on my web page. It would be great if someone could port it to an open source compiler.


----------



## Severn

Not trying to be argumentative but ... the Embarcadero starter C++ Builder 10 is $49. The Abakus add on components are about $80. 

That's 1/4 of an MTH engine... not so bad!

To me a lot depends on whether Mark thinks a port is worth it... otherwise it might happen but be an unsupported "one off".

Functionality and the ability to run on more platforms is another possible facet to consider.

I don't mind speculating...

Right now as I understand it B10 is supported on many platforms but this gets into an expense. Such is no longer $49.

QT (which to me is married to gnu c++ but maybe not) claims multiple platform support including things like android. I believe windows and linux. Perhaps the apple port is 'ok'. I'm a little skeptical about the hand helds.

QT is free-ish. There's a free version but there are not free versions too. It may well be that you pay to get the other kinds of platform support.

Android's native interface is Java-ish. It's their flavor of Java. I don't know as much about ipad and variants -- i believe there are multiple options but not java.

Personally I'd probably prefer Java from a pure language point of view. I like it for one, it's cleaner looking and simpler than C++ (but more limited in many aspects). And it runs on a lot of platforms too.

Reading off the USB port might be an issue with it though. I found one support library for such but I don't know how well it works, or if it works beyond windows without writing the lowest level interface code yourself.

Java deployment is something of a pain, because the user has to install Java first -- not just an ".exe". (at one time you could get .exe wrappers, not sure where that is these days and that doesn't cover apple, linux or the others at all)

Bit twiddling and "byte busting" is not so ideal in Java but workable -- RTC needs some of this. Java and real-time apps don't quite go together. But for slow-speed not so critical "real time" -- maybe not so bad. RTC might fall into this category.

An Android version is still an Android port in my mind even though it's java-like. Its native gui is a complete change from that offered by java itself. (I've done a dollop of it.) I'm skeptical of going outside its native development platform, although I've read in brief quite some time back this is possible. My instinct though would be to start native and get the basics working... then if one felt strongly to try QT or something else. Someone with previous experience might be able to "just do it" however.

I know almost nothing about ipads, but suspect it's a complete change to a different language and gui interface too.

These days in my neck of the woods you get a fair number of folks who push Python, and in a few cases Ruby, over these more traditional approaches.

I just mention them ... my experience with either is limited.


----------



## Severn

I see in the previous post Mark supports the notion of a port. I'm happy to try a port of the AMPCS to QT/gnu c++ tool chain. (the clip player)

The AMPCS is much smaller than RTC and there's only one form (gui page) to get right.

Depending on level of difficulty, etc... that might shed some light on an RTC effort.

And if it's abandoned I won't feel like it was a tremendous waste of time.


----------



## L0stS0ul

Visual Studio Community edition is free and allows for rapid windows ui programming. I have the pro version for work but community has everything needed for this project. Shouldn't be too hard to rip out the ui elements in the borland project and get it working in visual studio. I'll probably take a crack at it just for my own amusement. I don't see any reason for something like this to ever run on a mobile platform.


----------



## Severn

One thing B10 doesn't have (that I can tell) is any layout managers for the gui widgets/components -- in other words it's basically absolute x/y addressing for them (although the details of it are hidden from you in the gui editor part of the tool)... and so where you drag them over to the form where the components are, and their lengths, widths and all that are fixed as they were created.

There's no stretchiness to them or any kind of way (that I can tell) to resize the windows this way or that ... (mark is now going to tell me that i'm wrong and one can do all this by ____ )

So yeah, it'd be nice to see what the windows gui stuff can do. I have no familiarity or limited.

I did use the vs community edition some years ago... but I think that was just to get something compiled and working from somewhere else, then ported to the gnu stuff...

(or something along those lines)

And we didn't do windows guis at all. If anything we tended to use Java if there was a need for a gui, even if it was just a thin veneer and called the real c-code sort of thing vs a pure native java app.


----------



## L0stS0ul

Microsoft has created some of the best UI tools out there. I do lots of java (and other languages) as well and nothing comes close. I really don't like programming java for ui's other than android. 

I'm going to play around with creating a native windows library of the current functionality so I can start playing with a UI in C#, as I believe that is the fastest way to a ui on windows. Once the core functionality is in a lib it's easy to use it in other places.


----------



## Severn

I think you are right but it's just folks don't support windows only development around here, so I've no experience at all.

Java swing was always pretty blah blah, and the new javafx is better looking and modernized but still -- it's a money issue, how much is anyone really investing in it...

it's a step down then, i think this is true since m'soft is investing completely in windows.

ok, forgetting about all that -- so what about this idea then.

Define the commands (which are just ascii codes now) into some kind of abstraction... then the abstraction can be implemented in a variety of platforms, languages, etc...

their "instances" would be represented ultimately as the actual char codes going over to the TIU...

Just a vaguish idea..


----------



## L0stS0ul

I think I am missing something here. I see the RTC project and it's a huge project for sure. I thought this thread was just looking at creating something to mess with the audio files but it sounds like you guys are looking to integrate all of this in with the RTC project. 

I'm up for creating a ui for hacking the audio file to be uploaded by the TIU. I wasn't aware of the larger project that already has significant investment in it.


----------



## Severn

Ah... i'm getting my "threads" mixed up. Sure, the RTC is one thing, the audio clip project is another. I don't have any idea if Mark plans to mix them together.

So, anyway forget my previous comment on the topic, it was all about command representation in regards to the RTC -- nothing at all to do with AMPCS program (clip player).


----------



## gunrunnerjohn

I downloaded Mark's program, pretty neat to be able to play the sequences. 

One question...

How does one figure out the mapping of the sounds to the functions on the remote. Somewhere, there is a mapping of the whistle, bell, etc. to the actual sound clip. Has anyone figured out that linkage?


----------



## Severn

Mark can answer better than I can... maybe he has some other findings... I'd like to be contradicted here.

But I think so far there's no indication there's a "master table" in that way.

There are other non-sound data areas in the .mth file but it's not obvious as to their function -- maybe that's what one or more of them are somehow.

One theory is that unfortunately the sound-clips like the bell, horn, etc.. that are the same command train to train are hard coded in terms of their location in the sound file "index". (the table at 0x0100 -- so for example the bell is always at offset "10", etc...)

One way to help determine this would be to compare a bunch of .mth files to see if they line up in this manner.

I've yet to do anything like that. I've just looked at 2 files and have but the 1 engine which is ps2... but it would be worth a look.
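A comparison along those lines could start by checking which index slots are occupied in each file. This is only a sketch: it assumes an unused slot is stored as sixteen 0x00 or sixteen 0xFF bytes, which nobody in this thread has confirmed, and the function names are made up for illustration.

```python
INDEX_START = 0x0100   # index table start reported earlier in this thread
RECORD_SIZE = 16       # each index record is 16 bytes
INDEX_LAST = 0x10E0    # assumed to be the start of the last record

# Assumption: an unused slot is all zeros or all 0xFF (erased flash).
EMPTY_RECORDS = (bytes(RECORD_SIZE), b"\xff" * RECORD_SIZE)


def occupied_slots(image: bytes) -> set:
    """Zero-based slot numbers whose 16-byte record is not blank."""
    slots = set()
    offsets = range(INDEX_START, INDEX_LAST + 1, RECORD_SIZE)
    for slot, offset in enumerate(offsets):
        record = image[offset:offset + RECORD_SIZE]
        if len(record) < RECORD_SIZE:
            break                      # file is shorter than the index table
        if record not in EMPTY_RECORDS:
            slots.add(slot)
    return slots


def compare_layouts(images: dict) -> tuple:
    """Return (slots used in every file, slots used in only some files)."""
    per_file = [occupied_slots(image) for image in images.values()]
    if not per_file:
        return set(), set()
    common = set.intersection(*per_file)
    partial = set.union(*per_file) - common
    return common, partial
```

Feeding it a dictionary of file name to file contents gives two sets to look at. Slots used in every file would be consistent with a hard coded layout, but that only shows the slots line up, not that they hold the same kind of sound, so listening to the clips is still needed.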


----------



## L0stS0ul

Yeah, I've looked over a few different files now and it does look like the mapping could be static. If every file has the same layout then there is likely no linkage within the file. It would also explain why my "passenger" engine has duplicate entries that also map to "freight" sounds that are not used. That could also explain why there are lots of empty sections in each file. Those could be empty for sounds used by other engines with more features. Just a thought. 

Somewhere the gear ratios and stuff need to be defined as well for the speed controller. I'm pretty sure these also manage that right? I would think all of that would be in the header and the sounds would all follow later in the file.


----------



## Severn

Mark also thinks the "hotkeys" are at 0x1900, for example in another email he said:

"Since I can see where the hotkey codes are in the sound file. I edited a sound file to add a new hotkey. I changed one byte from 0x00 to 0x01 which should add the Beacon Light hotkey. I loaded the edited sound file into the engine, added the engine to a remote and the new hotkey showed up in the list. I did not get any error messages from the loading."

He also thinks the high-bit on the "data" upper nibble in the sound file table (right most column in the table in adpcm player) indicates whether the clip is PCM or ADPCM. (e.g. 0x80). 

I think the 3rd bit in the low-nibble indicates whether there's a sub-clip (e.g. 0x04)... although it seems there may be other ways to tell this as well.
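Written out as code, those two guesses about the "data" byte look like this. Both bit meanings are hypotheses from this thread rather than confirmed facts, the position of the byte inside the 16-byte record is left to the caller, and the names are illustrative only.

```python
FLAG_PCM = 0x80       # high bit: clip is PCM rather than ADPCM (Mark's hypothesis)
FLAG_SUBCLIP = 0x04   # third bit of the low nibble: clip has a sub-clip (hypothesis)


def describe_data_byte(value: int) -> dict:
    """Split the index 'data' byte into the two suspected flags plus the rest."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    known = FLAG_PCM | FLAG_SUBCLIP
    return {
        "encoding": "PCM" if value & FLAG_PCM else "ADPCM",
        "has_subclip": bool(value & FLAG_SUBCLIP),
        # Bits nobody has explained yet, kept so they can be tabulated later.
        "unknown_bits": value & ~known & 0xFF,
    }
```

Keeping the unexplained bits separate makes it easy to tabulate them across many clips and look for a pattern, for example against the suspected replay rate.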

We've talked about the data byte indicating replay rate but it isn't clear that it does.

I also have one clip whose data byte has the value of zero, which may mean it's "deleted" or not played... even though it's there. I have never heard it so maybe. (It's a dialogue and it seems to be missing the "response" clip.)

That's another one of interest, some clips seem tied together. Does the table say this somehow... 

etc...


----------



## gunrunnerjohn

There has to be some indication of where the hotkeys are mapped as you can change their order. That may be buried in some RAM and not the FLASH, which would make more sense since you can change them on the fly.


----------



## Severn

Is anyone familiar with the engine catalog numbering scheme that MTH uses, and does anyone know even the approximate range of numbers for O-gauge in their overall stock catalog? (For example, MTH 20-2200-1 is an engine on their website support page, but MTH 20-2700-01 is not. So what is the possible valid range of these numbers for engines from 19xx to today?)


----------



## Mark DiVecchio

Version 1.1.0 of ADPCM is available on my web page 

http://www.silogic.com/trains/ADPCM.html

Generally works better and can generate mp3 files from each sound clip.


----------



## L0stS0ul

I would love to know more specifics of how you edited one of the sounds as I'd like to give it a try. I'd like to take the horn from one engine and update another engine file with it. Any tips on software you used to actually remove the old mp3 and add the new mp3 into file would be helpful. I already have the 2 files I want to swap horns.


----------



## Mark DiVecchio

Everything that I know about the format of the sound file is encapsulated into the ADPCM program. Replacing a clip in the sound file is technically simple, but the complexity lies in the need to write a computer program to do it.

You would have to write a program to read in the sound file, parse the index, convert the mp3 file into 4 bit ADPCM, find enough space in the sound file to add the new sound, move the sound into place and then rewrite the sound file on disk. Then, of course, upload it into the engine.

This is simple if you are a programmer and can understand my C++ ADPCM program.
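Of those steps, "find enough space" is the easiest to sketch in isolation. The example below assumes the index has already been parsed into (start, length) byte extents, which is a simplification since the real records count in frames and sample offsets, and it assumes audio data begins at 0x8000 as reported at the top of the thread. The function name is hypothetical.

```python
AUDIO_START = 0x8000   # start of audio data reported at the top of this thread


def find_gap(extents, file_size: int, needed: int):
    """Return the lowest offset with `needed` unreferenced bytes, or None.

    `extents` is an iterable of (start, length) byte ranges already used by
    clips; how to derive them from the index records is not shown here.
    """
    cursor = AUDIO_START
    for start, length in sorted(extents):
        if start - cursor >= needed:
            return cursor                  # hole before this clip is big enough
        cursor = max(cursor, start + length)
    if file_size - cursor >= needed:
        return cursor                      # room after the last clip
    return None


# Example: two clips with a 0x100-byte hole between them.
used = [(0x8000, 0x100), (0x8200, 0x100)]
first_fit = find_gap(used, 0x9000, 0x100)
```

A region that no index record points at is not guaranteed to be free. It could hold data referenced from somewhere else in the header, so treat any gap found this way as a candidate only.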


----------



## gunrunnerjohn

But Mark, you already understand the program and the process, so why not just go ahead and write the application?


----------



## Mark DiVecchio

There are a lot of better programmers out there than me.

I had actually done part of this manually several months ago. Using a program called HxD (https://www.mh-nexus.de/en/hxd/) I was able to cut a clip from one sound file and stuff it into another. It was an ugly manual process but it showed that the scheme could work and that the resultant sound file could be loaded into an engine.


----------



## Mark DiVecchio

I spent some time thinking about ways to change the sound file. The simplest way is to replace a clip in one sound file with a clip from another sound file. This does not require any sound format changes.

I added a routine in ADPCM that lets you do that. I have a beta version which I'd like to have some help testing. 

You can read about it here:

http://www.silogic.com/trains/ADPCM.html#editing

Mark


----------



## L0stS0ul

That is very cool! I'll give it a try tomorrow and let you know how it goes


----------



## gunrunnerjohn

Nice job, that will give people something to do flipping the sounds in their MTH locomotives. It is cool to be able to toss in a couple of custom sounds!


----------



## L0stS0ul

I have successfully edited a sound file. I replaced all the whistles from 152-156 as well as the forward and backward ones at 43 and 44. Uploading it into my engine now to see the output. File size ended up being exactly the same as it was before the edit. 

I've found that going backward and starting at the highest clip seems to be more reliable when editing. When going forward I had several clips get cutoff. Maybe it was a fluke. Definitely check the sounds after editing. 

So very cool and very powerful to have.


----------



## Mark DiVecchio

Right now, I don't shorten the file length. I'll have to think about doing that. Even though the sound file is not shortened, the space freed by using a shorter clip is available for other longer clips.


----------



## gunrunnerjohn

Mark, I don't see a need to shorten the file length. Who cares if it's a bit longer?


----------



## Mark DiVecchio

*ADPCM v1.2.0 Sound File Editor*

Version 1.2.0 of my sound file editing program, ADPCM, is available on my web page: 

http://www.silogic.com/trains/ADPCM.html

This version lets you

1. play the clips in a sound file

2. copy a clip from one sound file to another

3. insert your own created clip into a sound file

4. edit the sound file clip index

With (3), you can insert any 4 bit IMA ADPCM clip into a PS2 or PS3 sound file and you can insert any 24 bit big endian signed PCM clip into a PS3 sound file. You can create those types of clips using freely available tools (I used Audacity and Sox - links to these on my web page).
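If you would rather produce the PS3 PCM format from your own code than from Audacity or Sox, packing samples as 24 bit big endian signed values takes only a few lines. This is a sketch of the byte layout only. The function names are mine, mono audio is assumed, and it does not cover resampling or the ADPCM case.

```python
def pack_pcm24_be(samples) -> bytes:
    """Pack signed integer samples as 24-bit big-endian PCM, 3 bytes each."""
    out = bytearray()
    for sample in samples:
        if not -(1 << 23) <= sample < (1 << 23):
            raise ValueError(f"sample {sample} does not fit in 24 bits")
        out += sample.to_bytes(3, "big", signed=True)
    return bytes(out)


def widen_16_to_24(samples16) -> list:
    """Scale 16-bit samples up to the 24-bit range by shifting left 8 bits."""
    return [sample << 8 for sample in samples16]


# Example: three 16-bit samples widened and packed into nine bytes.
clip = pack_pcm24_be(widen_16_to_24([0, 1000, -1000]))
```

The resulting bytes are raw sample data with no header, which matches how raw clips are imported and exported in Audacity.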


----------



## Mark DiVecchio

*Quilling Whistles and Crossing Sound*

Using my program ADPCM, which originally was written to play sound file clips on your PC, I have been able to edit the sounds in that file, download them to an engine and have the engine play them.

I've created 4 new videos:

1. ADPCM Video 3 - Demo of the various whistle clips in a sound file - this explores the different whistles/horns.

2. ADPCM Video 4 - Crossing Sound SXS added to a PS2 engine that did not originally have one

3. ADPCM Video 5 - Proto (or Quilling) Whistle SPW added to a PS3 engine that did not originally have one

4. ADPCM Video 6 - In my train room - operating the Proto (or Quilling) whistle SPW with my RTC program

You can watch these videos on my web page at:

http://www.silogic.com/trains/ADPCM.html


----------



## L0stS0ul

Amazing work! I'll download and give the latest a try


----------

