What about using the MERG RFID sensor system to trigger one of a series of sound clips that rise in volume as the train approaches and fade as it recedes?
The sounds would be held on an SD card, and the RFID tag would trigger playback starting at a point 'in the distance'. The tag, fitted to the leading vehicle and linked to a sound file for that train consist on the sound file player, would trigger the appropriate sound for that consist.
A speaker located at an appropriate position would then play the sounds.
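As a rough sketch of the idea (not a tested MERG or Dream Player setup), the tag-to-sound lookup and the approach/recede volume curve might look something like this in Python on a Raspberry Pi. The tag IDs, file names and the function names sound_for_tag and pass_by_gain are all made up for illustration, and the sine-shaped fade is just one assumed way of getting a 'pass-by' effect. Reading the tag and actually playing the audio are left out.

```python
import math

# Hypothetical tag IDs and sound files, for illustration only.
CONSIST_SOUNDS = {
    "04A1B2C3": "class37_freight.wav",
    "04D4E5F6": "hst_express.wav",
}

def sound_for_tag(tag_id, default=None):
    """Return the sound file linked to an RFID tag, or default if unknown."""
    return CONSIST_SOUNDS.get(tag_id.upper(), default)

def pass_by_gain(t, duration, floor=0.05):
    """Volume (0.0 to 1.0) at t seconds into a clip lasting duration seconds.

    Starts quietly 'in the distance', peaks at the midpoint as the train
    passes the speaker, then fades away again. Outside the clip it is 0.
    """
    if duration <= 0 or t < 0 or t > duration:
        return 0.0
    shape = math.sin(math.pi * t / duration)  # 0 at both ends, 1 in the middle
    return floor + (1.0 - floor) * shape

if __name__ == "__main__":
    clip = sound_for_tag("04a1b2c3")
    print("Tag maps to:", clip)
    for second in range(0, 11, 2):
        print(second, "s gain", round(pass_by_gain(second, 10.0), 2))
```

In practice the fade could equally be baked into the sound files themselves, in which case only the lookup part is needed.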
An Arduino, Raspberry Pi or a dedicated sound player such as Pricom's Dream Player Pro could be the basis of the sound player.
IIRC something like the Dream Player has 16-channel polyphonic stereo sound capability, far more than any sound decoder. The sound file memory is limited only by the SD card, there are no motor control issues, no cams to fit, fewer decoders needed and less cost. The sound emanates from static point source(s) and is heard at an appropriate volume level from position(s) only when the trains pass those position(s). To quote a phrase, 'just like the real thing'!
But you do realise this will now extend the 'scale colour' debate to 'scale sound', and no doubt we will all need to take ear defenders to exhibitions if this concept catches on.