DRM, Digital Radio Mondiale, is a form of digital radio. Transmissions are relatively scarce here in Western Europe, but it seems that interest is increasing elsewhere.
DRM signals come in different Modes and Spectra. Here in Western Europe the (few) transmissions that can be received use Mode B with a spectrum of 10 kHz. Other spectra are 4.5, 5, 9, 18 and 20 kHz. The Modes are defined such that each performs optimally under a given fading pattern.
Unlike an AM signal, the spectrum of a DRM signal is not centered around a clear carrier; the spectrum looks more or less like a "block", as seen in the picture.
The picture shows the spectrum and the waterfall of a transmission of Radio Kuwait. The transmission is Mode B, spectrum 3, i.e. a width of 10 kHz.
The DRM basics are defined in terms of a sequence of 12000 (complex) samples per second. Each complex sample gives the amplitude and the phase of the signal at the sampling time.
The DRM signal is then defined in terms of symbols or words, and words are grouped into frames. Modes differ in the structure of these words. The most common mode, Mode B, has words of 320 samples. The prefix length is 64 samples; these prefix samples are a copy of the last 64 samples of the word. In other modes, the word length and the prefix length may differ. (Such a prefix, a copy of the last samples of the word, is very helpful both in determining the first sample of a word and in computing the frequency offset of the signal.)
The 256 "useful" samples of the word in Mode B, i.e. the samples in the word, not counting the prefix, are - using an FFT processor - mapped onto values (carriers) in the frequency domain.
These "words" with frequency domain values (carriers) are then the basis for defining the content. As always, not all carriers carry useful information: for Mode B and Spectrum 3, only 206 of the 256 carriers contribute to the content of the transmission; the remaining ones, 25 on each end of the spectrum, are (supposed to be) zeros.
A group of these "words" forms a frame. For Mode B, a frame is built up from 15 such words. The duration of a "word" is 26.67 ms (320 samples at 12000 samples per second), the carrier spacing is 46.875 Hz, and the duration of a frame is then 400 ms.
Such a frame then has 3840 cells with complex values (15 x 256), of which 3090 cells (15 x 206) carry useful values. Note that in the time domain the frame takes 4800 samples (including the prefixes), so with a sample rate of 12000 samples a second there are 2.5 frames a second.
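The Mode B figures above all follow from a handful of basic numbers. The small sketch below just redoes that arithmetic; the constant names are made up for this example and do not come from the decoder's sources.

```python
# Mode B figures quoted in the text; everything below is derived from them.
SAMPLE_RATE = 12000      # complex samples per second
WORD_LENGTH = 320        # samples per word, prefix included
PREFIX_LENGTH = 64       # samples in the prefix
WORDS_PER_FRAME = 15
USED_CARRIERS = 206      # Mode B, spectrum 3

useful_samples = WORD_LENGTH - PREFIX_LENGTH             # size of the FFT
carrier_spacing_hz = SAMPLE_RATE / useful_samples
word_duration_ms = 1000 * WORD_LENGTH / SAMPLE_RATE
frame_samples = WORDS_PER_FRAME * WORD_LENGTH
frame_duration_ms = 1000 * frame_samples / SAMPLE_RATE
cells_per_frame = WORDS_PER_FRAME * useful_samples
useful_cells = WORDS_PER_FRAME * USED_CARRIERS
frames_per_second = SAMPLE_RATE / frame_samples

print(useful_samples, carrier_spacing_hz, round(word_duration_ms, 2))
print(frame_samples, frame_duration_ms, cells_per_frame, useful_cells)
print(frames_per_second)
```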
In the frequency domain frame, predefined locations in different words have predefined values, so-called pilot values. Matching these pilot values with the values that are actually found can be used to reconstruct the signal as it was at the transmitter side. DRM requires coherent decoding. Data in DRM is encoded as QAM4, QAM16 and/or QAM64. It is up to the receiver to reconstruct the values as accurately as possible as they were at the transmitter side. (Note that a QAM4 value encodes 2 bits as a single complex value, QAM16 encodes 4 bits as a single complex value and QAM64 encodes 6 bits as a single complex value. Of course, both phase and amplitude are important then.)
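The idea of using pilots to restore a value can be illustrated with a very small sketch. It assumes the channel is one complex gain that is the same for the pilot and for the data cell next to it; a real decoder has to interpolate between pilots in time and frequency. The function names and the sample values are invented for this example.

```python
import cmath

def estimate_channel(received_pilot, known_pilot):
    """Complex gain (amplitude and phase) the channel applied at a pilot position."""
    return received_pilot / known_pilot

def equalize(received_cell, channel_gain):
    """Undo the estimated gain on a data cell."""
    return received_cell / channel_gain

# Hypothetical channel: half the amplitude and a phase turn of 40 degrees.
channel = cmath.rect(0.5, cmath.pi * 40 / 180)
known_pilot = complex(1.0, 1.0)    # illustrative pilot value, not from the standard
sent_data = complex(-3.0, 1.0)     # a QAM16 point as it left the transmitter

received_pilot = channel * known_pilot
received_data = channel * sent_data

gain = estimate_channel(received_pilot, known_pilot)
restored = equalize(received_data, gain)
print(restored)
```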
The picture shows the constellation of an ideal QAM16 signal. To extract the 4 bits, the (reconstructed) signal value should match one of these 16 positions as closely as possible. Obviously, for QAM64, the constellation defines 64 different values.
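Matching a restored value to the closest of the 16 positions can be sketched as below. The levels -3, -1, 1, 3 and the bit assignment are a generic textbook choice made for this example; they are not the mapping prescribed for DRM, and the sketch makes a plain hard decision only.

```python
LEVELS = (-3.0, -1.0, 1.0, 3.0)   # generic QAM16 grid, assumed for this sketch

def nearest_level_index(value):
    """Index of the grid level closest to value."""
    return min(range(len(LEVELS)), key=lambda i: abs(value - LEVELS[i]))

def qam16_decide(cell):
    """Return (closest ideal point, 4 bits) for a restored cell value."""
    i = nearest_level_index(cell.real)
    q = nearest_level_index(cell.imag)
    bits = (i >> 1, i & 1, q >> 1, q & 1)   # hypothetical bit mapping
    return complex(LEVELS[i], LEVELS[q]), bits

point, bits = qam16_decide(complex(2.7, -1.2))
print(point, bits)
```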
A frame in DRM contains three types of values, stored in predefined locations in the DRM frames.
FAC, Fast Access Channel, values, containing QAM4 encoded bits. The FAC presents some general information, such as the mode and the spectrum, the number of services, the way the SDC and MSC are encoded etc.
SDC, Service Description Channel, containing detailed information on the actual payload, the audio and data services that are carried in the transmission. The bits for the SDC content are usually encoded in QAM16 (in DRM+ they are encoded in QAM4).
MSC, Master Service Channel, containing segments with the audio and data services. The bits are often encoded in QAM64, sometimes in QAM16.
A group of three subsequent frames forms a so-called superframe. The FAC in a frame tells whether the frame is the first or the last one in the superframe. Only the first frame contains the SDC bits. A superframe contains data good for 1.2 seconds of audio. Audio is encoded in an HE-AAC variant.
Note that all data in FAC, SDC and MSC is protected against errors by convolutional coding (decoded with a Viterbi decoder), while each individual segment carries a CRC checksum.
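How a CRC checksum lets the decoder verify a segment can be sketched as follows. This is a generic bitwise CRC-8; the generator polynomial 0x1D (x^8 + x^4 + x^3 + x^2 + 1), the zero start value and the absence of bit inversion are assumptions of this example, so the exact parameters DRM uses for each channel should be taken from the standard.

```python
def crc8(bits, poly=0x1D):
    """Bitwise CRC-8 over a sequence of 0/1 values, first bit first."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly
    return reg

def to_bits(value, width):
    """Value as a list of bits, most significant bit first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]

payload = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]   # made-up segment content
codeword = payload + to_bits(crc8(payload), 8)   # what would be transmitted

# Receiver side: running the CRC over payload plus checksum gives zero
# when nothing was damaged.
print(crc8(codeword) == 0)

damaged = list(codeword)
damaged[3] ^= 1                                  # flip a single bit
print(crc8(damaged) == 0)
```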
The decoder gets as input a sample stream, in our case the sample stream has a rate of 12000 complex samples a second.
Of course, the first questions to be answered are which mode and spectrum occupancy the transmission has, and where the "words" start.
The Mode can be determined by correlation: the word length and the prefix length, mentioned before, depend on the mode. Mode detection then simply consists of correlating the prefix with the supposed end of the word and finding the best match. To get a more reliable result, the process is done over a range of words rather than just a single one. Experience shows that correlating over 15 to 20 word lengths gives a reasonable result.
Of course, by detecting the mode, the first sample of a word is detected at the same time. A minor, though not unimportant, issue is that there might be a clock offset in the decoder compared to the clock in the transmitter. This implies that after the transmitter has sent a number of samples, the receiver may be one or two samples off. So, continuously monitoring that the time synchronization is still OK is a must.
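The correlation described above can be sketched for Mode B as follows. The stream is synthetic (random samples, each word preceded by a copy of its last 64 samples) and the function names are invented here; this is an illustration of the principle, not the decoder's code. A mode detector would presumably repeat the search with each mode's word and prefix length and keep the best match.

```python
import cmath
import random

WORD_LENGTH = 320
PREFIX_LENGTH = 64
USEFUL = WORD_LENGTH - PREFIX_LENGTH

def random_sample(rng):
    """Unit-amplitude sample with a random phase."""
    return cmath.rect(1.0, rng.uniform(-cmath.pi, cmath.pi))

def make_word(rng):
    """One synthetic word: 256 random samples preceded by a copy of the last 64."""
    useful = [random_sample(rng) for _ in range(USEFUL)]
    return useful[-PREFIX_LENGTH:] + useful

def find_word_start(samples, n_words=16):
    """Offset (0 .. WORD_LENGTH-1) at which prefix and end of word match best."""
    best_offset, best_score = 0, -1.0
    for offset in range(WORD_LENGTH):
        total = 0j
        for w in range(n_words):                 # accumulate over several words
            base = offset + w * WORD_LENGTH
            for k in range(PREFIX_LENGTH):
                total += samples[base + k] * samples[base + k + USEFUL].conjugate()
        if abs(total) > best_score:
            best_offset, best_score = offset, abs(total)
    return best_offset

rng = random.Random(1)
true_start = 137                                 # arbitrary position of the first word
stream = [random_sample(rng) for _ in range(true_start)]
for _ in range(18):
    stream += make_word(rng)

found = find_word_start(stream)
print(found)
```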
The type of the spectrum - defining the width - can be deduced from the words in the FFT output. Looking at the strength of the carriers in the FFT output and noticing where the strength changes is the way to go. (Note that while the FFT was done over 256 elements (using Mode B), the outermost elements should have an almost zero amplitude; for Mode B, spectrum 3, only the central 206 out of the 256 carriers contain useful information.)
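A rough sketch of deducing the width from the carrier strengths is given below. It assumes the powers are averaged over a number of words and ordered from lowest to highest frequency; the threshold and the "nearest nominal width" step are heuristics chosen for this example, not the decoder's actual rule.

```python
CARRIER_SPACING_HZ = 12000 / 256                 # 46.875 Hz for Mode B
NOMINAL_WIDTHS_KHZ = (4.5, 5, 9, 10, 18, 20)     # the spectra named in the text

def occupied_carriers(powers, threshold_ratio=0.1):
    """First and last carrier whose power exceeds a fraction of the mean power."""
    limit = threshold_ratio * sum(powers) / len(powers)
    active = [i for i, p in enumerate(powers) if p > limit]
    return active[0], active[-1]

# Synthetic power per carrier: 25 (almost) empty, 206 active, 25 (almost) empty.
powers = [0.001] * 25 + [1.0] * 206 + [0.001] * 25

first, last = occupied_carriers(powers)
count = last - first + 1
width_hz = count * CARRIER_SPACING_HZ
nominal = min(NOMINAL_WIDTHS_KHZ, key=lambda w: abs(w - width_hz / 1000))
print(count, width_hz, nominal)
```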
Happily, some values at predefined locations in the spectrum have larger, known amplitudes and given phases. By comparing the spectrum as computed with the spectrum as it should be, the decoder can detect the frequency offset.
By comparing the phases of the sample values in the prefix with those of the corresponding sample values at the end of the time domain word, a more precise error can be computed and corrected.
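The phase comparison between prefix and end of word can be sketched as follows. A frequency offset turns the phase of every sample a little further than the previous one, so a prefix sample and its copy 256 samples later differ by a fixed phase angle from which the offset follows. The word is synthetic and the names are invented for this example; the estimate is only unambiguous for offsets below half the carrier spacing.

```python
import cmath
import random

SAMPLE_RATE = 12000
USEFUL = 256
PREFIX_LENGTH = 64

def fine_offset_hz(word):
    """Offset derived from the phase turn between the prefix and the end of the word."""
    total = 0j
    for k in range(PREFIX_LENGTH):
        total += word[k + USEFUL] * word[k].conjugate()
    return cmath.phase(total) * SAMPLE_RATE / (2 * cmath.pi * USEFUL)

rng = random.Random(7)
useful = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(USEFUL)]
clean_word = useful[-PREFIX_LENGTH:] + useful     # 320 samples, prefix first

applied_hz = -6.98                                # offset to be recovered
shifted = [s * cmath.exp(2j * cmath.pi * applied_hz * n / SAMPLE_RATE)
           for n, s in enumerate(clean_word)]

estimate = fine_offset_hz(shifted)
print(round(estimate, 2))
```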
Of course, if the frequency offset is larger than the carrier distance (as said, approximately 46.9 Hz for Mode B), the carriers in the spectrum contain the wrong values and decoding is impossible. Conversely, if decoding is possible, then obviously the frequency correction is more or less OK.
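Finding an offset of a whole number of carriers can be sketched by sliding the expected positions of the strong reference carriers over the measured spectrum and keeping the shift that collects most power. The bin numbers in PILOT_BINS are hypothetical placeholders, and the power values are synthetic; the real positions are defined per mode in the standard.

```python
CARRIER_SPACING_HZ = 46.875
PILOT_BINS = (144, 176, 192)   # hypothetical positions of the strong carriers

def coarse_offset_bins(powers, pilot_bins=PILOT_BINS, max_shift=10):
    """Shift (in carriers) for which the expected pilot positions hold most power."""
    def score(shift):
        return sum(powers[(b + shift) % len(powers)] for b in pilot_bins)
    return max(range(-max_shift, max_shift + 1), key=score)

# Synthetic spectrum: flat, with the strong carriers found 6 bins too low.
applied_shift = -6
powers = [1.0] * 256
for b in PILOT_BINS:
    powers[b + applied_shift] = 4.0

shift = coarse_offset_bins(powers)
offset_hz = shift * CARRIER_SPACING_HZ
print(shift, offset_hz)
```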
So, the decoder has to deal with:
time synchronization, to continuously ensure that the first sample of a word in the time domain is found;
frequency synchronization, first of all to ensure that the frequency offset is less than (half of) the carrier distance (the "coarse" offset), secondly to ensure that small offsets within the carrier distance are also noted and dealt with;
extracting the FAC bits and decoding the FAC. Since the FAC is encoded in QAM4, this is easy if the frequency correction is OK. Note that if time and/or frequency correction are not done well, decoding the FAC is impossible;
extracting the SDC bits and decoding the SDC. Since the SDC is usually encoded in QAM16, decoding it is slightly more complex than decoding the FAC. As a rule of thumb, if FAC decoding is erroneous, SDC decoding will fail as well;
extracting the MSC bits and from them the bits for the selected service, and decoding these bits, usually AAC encoded audio.
The decoder contains labels to show the correctness of the time synchronization and the FAC, the SDC and the AAC decoding.
The picture of the decoder gives quite some information about the received and decoded signal.
at the top left, the measured frequency offset is shown. Two numbers, one for the so-called coarse offset, i.e. the number of carriers off (here -272 Hz), the second number telling the fine offset, i.e. the offset in the carrier (here -6.98 Hz). Since FAC decoding is possible, the coarse frequency offset is compensated for, and the fine offset is sufficiently well handled as well.
12000 mono is the indication of the rate and type of the audio output
the numbers 180 2 give an indication of the time offset. Here it states that after 180 "words" (each 320 samples), a correction was needed with at most two samples, i.e. the next word should start 2 samples later.
the number 0.982 tells that from the last 1000 audio frames, 982 could be correctly translated into sound.
the labels time sync etc speak for themselves
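The time offset figures in the list above (180 2) can be turned into a rough estimate of the clock difference between transmitter and receiver. Since the text says "at most two samples", the result is an upper bound; the function name is made up for this sketch.

```python
WORD_LENGTH = 320   # samples per word in Mode B

def clock_offset_ppm(words_elapsed, samples_corrected, word_length=WORD_LENGTH):
    """Clock difference in parts per million implied by a timing correction."""
    return 1e6 * samples_corrected / (words_elapsed * word_length)

ppm = clock_offset_ppm(180, 2)   # 2 samples over 180 words of 320 samples
print(round(ppm, 1))
```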
Below there is a row with 3, C, QAM16, AAC, which tells
that the spectrum is of type 3 (which just means a 10 KHz spectrum),
that the mode is Mode C,
that the bits carrying the audio are encoded as QAM16,
and that the audio itself is encoded as AAC (the other option is xHE-AAC).
Below this, there are 3 numbers, giving some information on the quality of the signal. The values are derived from the deviation of the values as measured to the values as they should be. In general: higher is better.
the number 24.08 tells something about the quality of the signal elements in which the FAC is encoded. The spectrum cells assigned to the FAC contain values encoded as QAM4; a value of 24 tells that the signal is good.
the number 5.428 tells something about the quality of the signal elements in which the SDC is encoded. The spectrum cells assigned to the SDC contain values encoded as QAM16; a value of 5 tells that the signal is not very good.
the number 5.984 tells something about the quality of the signal elements in which the audio data is encoded (MSC data is encoded as QAM16); the signal is reasonable.
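One common way to turn "the deviation of the values as measured to the values as they should be" into a single number is the ratio between the power of the ideal values and the power of the errors, expressed in dB. The sketch below shows that calculation; whether the decoder computes its three numbers this way, or on this scale, is not stated here, so treat it purely as an illustration.

```python
import math

def quality_db(received, ideal):
    """Ratio of ideal signal power to error power, in dB (higher is better)."""
    signal = sum(abs(p) ** 2 for p in ideal)
    error = sum(abs(r - p) ** 2 for r, p in zip(received, ideal))
    return 10 * math.log10(signal / error)

# The four QAM4 positions, each received with the same small error.
ideal = [complex(1, 1), complex(-1, 1), complex(-1, -1), complex(1, -1)]
received = [p + complex(0.1, -0.1) for p in ideal]

print(round(quality_db(received, ideal), 2))
```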
The next line has two elements: the name of the service, again Voice of Nigeria, and a box with text sent with the transmission.
The scopes at the bottom give some visual information about the signal.
The scope on the left shows the corrections to be applied to the incoming data in the frequency domain: the red line indicates the correction to the amplitude, the blue line the correction to the phase. It shows that the signal here is reasonable.
The scope on the right gives the constellation of the restored signal. The 16 dots of the QAM16 signal are clearly shown.
For the drm-receiver both an AppImage (for Linux-x64) and a Windows installer are available in the releases section of this repository. AppImage and installer are compiled with SDRplay (2.13), DABsticks and HackRF devices configured.
For Windows, SDRuno (see above) is a software radio package developed for all versions of the SDRplay RSPs. To support DRM decoding, a plugin has been developed and is available.
The plugin - and the sources - can be found here.
On shortwave there is more to do than just DRM. As a matter of fact, there are only a few DRM transmissions. Here we hear Radio Kuwait in the afternoon, and RRI (Radio Romania International) in the evening for an hour or so. The sw receiver - from which the drm-receiver is derived - was equipped with decoders for e.g. PSK, RTTY, etc. well before a DRM decoder was built.
The currently implemented decoders are:
ssb, with selection for usb or lsb;
psk, with a wide selection of modes and settings and with a visual tuning aid;
mfsk, with a visual tuning aid;
rtty, with a wide selection of modes and settings;
cw, with (almost) automatic selection of speed and a visual tuning aid;
drm, limited to 10 kHz bandwidth;
navtex (amtor), with a wide selection of options;
weatherfax decoder, with selection of a variety of settings.
The releases section of this repository contains - next to the sources - a Windows installer and, for Linux-x64, an AppImage.