A method for attaching structured information to noise data


Measurement data needs to be accompanied by additional information in order to know when it was recorded, which technical parameters were applied and what it represents. One way of attaching such meta-data to measurement data in a structured form is to use a separate meta-data file, e.g. an XML file that refers to the measurement file. Problems can result from this physical separation between the recorded data and its meta-data, especially if there is a one-to-many relationship between the meta-data file and the measurement data files. Once the link between those items is broken, the measurement data will most likely become almost worthless.

Basic method

The method described here avoids this drawback because it attaches all additional information directly to the measurement data, remarkably without increasing the amount of data. The information is imprinted into the existing data; one can say that it is added below the noise level, as described in the following.

The working principle of most analog-digital converters introduces an uncertainty of half an LSB (least significant bit) into each measurement value, which is usually called "digital noise" (quantization noise). When applying a method which statistically alters about one quarter of all the LSBs in the data stream, one can say that the introduced changes are at the level of half of the inherent "digital noise".

Two subsequent LSBs are compared with each other. If they are equal, an information bit of zero is derived; if they are different, the information bit is one. When imprinting information on an arbitrary digitized data stream there is a probability of about 50% that the two bits are already set as needed and a probability of about 50% that one of the two bits needs to be toggled. On average, about a quarter of all LSBs therefore need to be toggled.
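The pair comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not the author's implementation: the function names are hypothetical, the samples are assumed to be non-negative integers, and the choice to toggle the second value of a pair (rather than the first) is an assumption, since the text allows either.

```python
def extract_bits(samples):
    """Derive one information bit from each pair of consecutive samples.

    Equal LSBs give 0, different LSBs give 1.
    """
    return [(samples[i] ^ samples[i + 1]) & 1
            for i in range(0, len(samples) - 1, 2)]


def imprint_bits(samples, bits):
    """Return a copy of samples that carries bits in its LSB pairs.

    At most one LSB per pair is toggled.
    """
    if 2 * len(bits) > len(samples):
        raise ValueError("not enough samples for the requested bits")
    out = list(samples)
    for k, bit in enumerate(bits):
        i = 2 * k
        if ((out[i] ^ out[i + 1]) & 1) != bit:
            out[i + 1] ^= 1  # assumption: toggle the second value of the pair
    return out


samples = [12, 7, 33, 34, 90, 91, 5, 5]
marked = imprint_bits(samples, [1, 0, 1, 1])
print(marked)                # [12, 7, 33, 35, 90, 91, 5, 4]
print(extract_bits(marked))  # [1, 0, 1, 1]
```

In this small example two of the eight values had to be changed, which matches the expected average of about one quarter.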

Information imprinting procedure

According to the basic noise modulation method, it takes 16 measurement values to imprint one information byte. Each pair of subsequent measurement values delivers one information bit. Thus combining 8 pairs of measurement values provides one information byte, as shown in the following image:

The maximal amount of information bytes that can be imprinted into a given amount of measurement data using the described method can be calculated as follows:

Number of information bytes = Number of measurement values divided by 16

When designing the measurement data chunk size, one needs to make the chunk size equal to or larger than the amount of data that follows from the above equation:

Measurement data chunk >= 16 times the number of intended information bytes
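The capacity rule and the byte-wise procedure can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, samples are non-negative integers, and the bit order within a byte (most significant bit first) is an assumption because the original illustration is not available here.

```python
def capacity_bytes(num_values):
    """Number of information bytes that fit into num_values measurement values."""
    return num_values // 16


def imprint_bytes(samples, payload):
    """Imprint payload into the LSBs of samples, 16 values per byte."""
    if len(samples) < 16 * len(payload):
        raise ValueError("chunk too small: need 16 values per information byte")
    out = list(samples)
    for n, byte in enumerate(payload):
        for b in range(8):
            bit = (byte >> (7 - b)) & 1  # assumption: MSB first
            i = 16 * n + 2 * b
            if ((out[i] ^ out[i + 1]) & 1) != bit:
                out[i + 1] ^= 1          # toggle one LSB of the pair
    return out


def extract_bytes(samples, count):
    """Read count information bytes back from the LSB pairs of samples."""
    result = bytearray()
    for n in range(count):
        byte = 0
        for b in range(8):
            i = 16 * n + 2 * b
            byte = (byte << 1) | ((samples[i] ^ samples[i + 1]) & 1)
        result.append(byte)
    return bytes(result)


chunk = list(range(100, 148))           # 48 values, room for 3 bytes
marked = imprint_bytes(chunk, b"ADC")
print(capacity_bytes(len(chunk)))       # 3
print(extract_bytes(marked, 3))         # b'ADC'
print(capacity_bytes(1024 * 1024))      # 65536
```

No value in the chunk changes by more than one count, and the chunk length stays the same.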

NOTE: It may seem amazing that there is room for e.g. 64 KByte of additional information in a measurement data package of 1 Mega values without adding a single byte to the data package and without losing any significant information in the data obtained from the Analog-Digital-Converter (ADC) stage of a receiving system. The secret of this achievement is the exploitation of the so far unused digital noise which is inherent in digitized measurement data and which is now assigned a useful purpose.

Signal power considerations

Independently of how many least bits are toggled in a data stream, the toggling always constitutes an interference with the original noise signal. However, the lower the number of toggled bits, the smaller the impact on the original signal. To get an idea of the introduced interference level, one can estimate the ratio between the introduced interference and the value of the least bit from the share of bits that may be toggled (one out of B), multiplied by the probability that such a bit is actually toggled.

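Interference = 20 log10( 1/B * P )

where B is the number of bits among which at most one bit is toggled and P is the probability that the bit will actually be toggled. In the advanced methods described later, where B denotes the number of information bits, this group size corresponds to the token length.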

For example, the procedure described above toggles one out of two bits with a probability of 50%, which gives an average interference level of -12 dB compared to the value of the least bit of a measurement value. However, if in a worst-case scenario every second bit is toggled, the interference reaches a level of -6 dB, which may already be unacceptable depending on the overall system setup and parameters.
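The two figures quoted above follow directly from the formula and can be checked numerically. The function and parameter names below are hypothetical and chosen only for this illustration.

```python
import math


def interference_db(group_size, toggle_probability):
    """Interference relative to the least bit, in dB: 20 log10(P / B)."""
    return 20 * math.log10(toggle_probability / group_size)


# Basic method: one out of two bits, toggled with 50% probability.
print(round(interference_db(2, 0.5), 1))  # -12.0
# Worst case: one out of two bits is always toggled.
print(round(interference_db(2, 1.0), 1))  # -6.0
```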

Fortunately, more advanced methods of imprinting information into noise exist; they are described in the following.

Advanced information imprinting methods

When a set of multiple bits is used for transferring information items, those bits are usually called a token. The token length together with the applied coding scheme determines the amount of information that can be transferred by a single token. In general there is a trade-off between the amount of redundancy in the coding scheme and the degree of freedom for achieving the intended features.

When imprinting information into noise signals there is usually a large amount of data available into which information can be merged. Therefore methods with a highly redundant coding scheme are also feasible. The main objective is to change the original data as little as possible, since those data files are intended for various further processing stages and the error in the outcome of later processing steps should be minimized.

On the other hand, there is already an error from the uncertainty of the least bit resulting from the ADC working principle. When a sufficiently low interference level is reached, the impact of the information imprint on the main processing results becomes negligible, even below the ADC's noise. The following chart shows the principle of an information imprint into the least bit of a noise signal for a number of B information bits.

Characteristics of the advanced imprinting methods

In contrast to the basic working principle described before, the advanced methods derive information bits from more than two measurement values, in general B information bits from 2^B (2 to the Bth power) values. Due to the size of the conversion table needed to run the described procedure in practical applications, the method is limited to the following values of B:

Information   Token    Measurement values   Info in 1      Interference level compared
bits (B)      length   per info byte        Mega-values    to the least bit
2             4        16                   64 KByte       -14.5 dB (max -12 dB)
3             8        21.3                 ~48 KByte      -19.2 dB (max -18 dB)
4             16       32                   32 KByte       -24.6 dB (max -24 dB)
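The figures in the table can be reproduced with a short calculation. Note the assumptions: the token length is taken as 2^B, one Mega-value is taken as 1024 * 1024 values, and the toggle probability is assumed to be 1 - 1/2^B (a random token already carries the wanted value in one out of 2^B cases). That last assumption is not stated in the text; it is used here because it matches the listed average interference levels. The function name is hypothetical.

```python
import math


def table_row(info_bits):
    """Return the derived quantities for a token carrying info_bits bits."""
    token_length = 2 ** info_bits
    values_per_byte = 8 * token_length / info_bits
    kbytes_per_mega = (1024 * 1024) / values_per_byte / 1024
    toggle_probability = 1 - 1 / token_length  # assumption, see lead-in
    average_db = 20 * math.log10(toggle_probability / token_length)
    worst_db = 20 * math.log10(1 / token_length)
    return token_length, values_per_byte, kbytes_per_mega, average_db, worst_db


for b in (2, 3, 4):
    length, per_byte, kbytes, avg, worst = table_row(b)
    print(f"B={b} token={length} values/byte={per_byte:.1f} "
          f"info={kbytes:.0f} KByte avg={avg:.1f} dB worst={worst:.1f} dB")
```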

Implementation of the advanced imprinting methods

The implementation of the advanced imprinting methods is based on a highly redundant assignment between the set of 2^(2^B) tokens and the 2^B values of B information bits. The main idea is to generate the conversion table such that, from every possible input vector of 2^B input bits, a token carrying any desired information value can be reached by toggling at most one of the 2^B bits.

Those tables can be generated by an algorithm which assigns a value of the B output bits to each of the 2^(2^B) input vectors by iterating through all input vectors according to their values from 00..0 to 11..1 and by toggling one bit at a time in those input vectors. The result of that procedure for B = 2 can be seen below:

The first vertical column shows all possible combinations of the least bits of four input values. The body of the table shows the information value that is supposed to be imprinted on that input vector. Going first horizontally from the actual input vector to the intended information imprint and then vertically to the green value in the header, one finds the token that needs to replace the input vector. Since each token is uniquely assigned to an information value, the imprinted information is extracted by looking up the information value assigned to each token.

When investigating the conversion table thoroughly, one will find that there are redundant assignments. They follow from the fact that toggling one of 2^B bits or leaving the input vector untouched gives 2^B + 1 possibilities for assigning one of 2^B values. Those redundant assignments could be used, for example, to compensate for the toggled bits when calculating an average over all bits. This would reduce the impact of the introduced changes further, although they are already at a level that can hardly be detected by any means when using e.g. the most advanced B = 4 method.
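One way to obtain an assignment with the required property, without storing a conversion table, is sketched below. It is not the author's table or generation algorithm. It is a swapped-in, well-known construction in which the information value of a token is the XOR of the positions of all bits that are set. Toggling the bit at position p changes the value by XOR p, so any target value is reachable by toggling at most one bit. Function names are hypothetical.

```python
from itertools import product


def token_value(bits):
    """Information value of a token: XOR of the positions of all set bits."""
    value = 0
    for position, bit in enumerate(bits):
        if bit:
            value ^= position
    return value


def imprint_token(bits, target):
    """Return a token carrying target, differing from bits in at most one bit.

    len(bits) must be 2**B and target must lie in range(2**B).
    """
    out = list(bits)
    position = token_value(out) ^ target
    if position != 0:        # position 0 means the token already carries target
        out[position] ^= 1
    return out


# Exhaustive check for B = 2: 16 input vectors, 4 information values.
for bits in product([0, 1], repeat=4):
    for target in range(4):
        token = imprint_token(bits, target)
        assert token_value(token) == target
        assert sum(a != b for a, b in zip(bits, token)) <= 1

print(imprint_token([1, 0, 1, 1], 2))  # [1, 0, 1, 0]
```

In this construction, toggling the bit at position 0 never changes the information value, which is one concrete form of the redundancy noted above: that bit remains free for other purposes.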

Created: 2007-02-03 by Eckhard Kantz