**Abstract:**

Watermarking is used in a wide variety of applications, among them steganography,
copyright protection and owner identification. But watermarking can also be used
as a means of storing other kinds of useful information in the image.
This work discusses the advantages of putting such information into the image.

A watermarking algorithm that is suitable for embedding a large amount of information
in the image and that is robust to JPEG compression is also presented.

**Members:**

- Sunil Mohan Adapa

**Guide:** Prof. J. Sivaswamy

**Duration:** 4 months

**Introduction:**

Images convey more information than text, but images do not say everything. Additional information is always associated with an image. This information is sometimes put into the image file as a comment, which is no better than keeping it in a separate file or in a different column of a database. Take a geographical map as an example. Various places on the map are to be labeled. If one wants to do this without disturbing the original image, the solution is to keep the information separately and not etch it over the map. But if one then crops the image to a particular section, will the information about that section be retained and the information about the other sections of the map be dropped? The solution is to keep the information as close to the image as possible. The information about a section of the image should preferably be stored in that same section of the image.

Many other applications have the same need. In medical applications, the diagnostic information about an image can be kept close to the image. Labeled parts of an anatomical image can be handled in the same way.

There are a number of distinct application areas for watermarking. Many watermarking methods have been proposed for each of them. Each of these watermarking methods has its own requirements and limitations.

**Copyright Protection and Fingerprinting:**

Ownership of digital media can be verified in a copyright dispute by using the embedded data as proof. For this purpose the watermark must be robust: it should tolerate both malicious and unintentional attacks on the watermarked image. A number of watermarking techniques require that the original image be available during the watermark detection phase. These watermarking methods are called private schemes. Despite their advantages and robustness, such methods are not suitable for many applications because they need the original image at the detection stage. Most of these methods just "detect" the watermark and verify the ownership. They are not a means of transmitting useful data along with the image.

**Authentication or Tamper-proofing:**

When an image is transmitted over a channel, there is no guarantee that the received image is the unmodified original image. The task of authentication and tamper-proofing is to report any such changes in the image. Unlike in copyright protection, the watermark that is embedded in the image should change with any significant change in the image. Such watermarks are called fragile watermarks.

**Steganography:**

Spies in other countries have to use local channels to communicate. They should assume that the channel will be monitored. Hence the message should be sent in such a way that its presence is not revealed. Watermarking methods meant for such hidden transmission ensure that the presence of the message in the image is undetectable. These methods are not robust, as robust watermarks are easily detectable.

**Captioning and Annotation:**

These watermarking methods aim at embedding descriptive information in the image. They are usually fragile and are not robust to compression, cropping, etc. Unlike the other watermarking methods, these applications require a moderately large information capacity. The other methods have an information capacity of the order of a few tens of bytes.

**Requirements:**

A watermarking algorithm that is used for embedding information should have a number of properties that tend to conflict with one another. The watermarking method should not require the original image to be present during the watermark extraction phase. Such a method falls in the category of "blind" watermarks. Blind watermarks usually lack robustness.

A very important property that the watermark should have is robustness to JPEG compression. As JPEG is a widely used image format, it should be possible to embed the information in the JPEG compressed image. The watermark should adapt itself to various compression qualities. The watermark must also have a considerably large information capacity. The capacity should not depend on the size of the image or on the number of channels in the image. An application might have several kilobytes of information to be embedded in a small image. Ideally the size of the image should not decide the amount of information that can be put into it. The size of the watermarked image file would then increase as more watermark content is added.

The watermark information about a part of the image should be embedded only in that part of the image. When the watermarked image undergoes operations such as cropping, the information that is associated with the selected part of the image should be retained and the information that concerns other parts should be removed.

For most applications the watermark should not introduce visible distortion in the image. Medical applications are the most sensitive. For some applications some distortion in the image content might be acceptable.

**Proposed Methods for Information Embedding:**

Watermarking methods meant for steganography can be used for embedding large amounts of information. But most of them are not robust to compression.

**A Simple Method:**

One simple method that is not robust to compression is LSB encoding. LSB encoding is very simple and has been used for a variety of purposes. In this method the least significant bit of every component (or of the blue component only, where changes are least noticeable) is replaced by a watermark information bit. This method can store a fair amount of information, but the capacity is still limited by the number of samples in the image. The embedded data does not survive lossy JPEG compression.

Figure 1 shows LSB encoding.

*Figure 1: LSB encoding*
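
LSB encoding as described above can be sketched in a few lines. This is only an illustration: the function names `embed_lsb` and `extract_lsb` and the sample values are hypothetical, and the samples stand for the values of one colour component (for example blue) in raster order.

```python
def embed_lsb(samples, bits):
    """Return a copy of `samples` whose first len(bits) values carry the bits."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples to hold the watermark")
    out = list(samples)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the watermark bit.
        out[i] = (out[i] & ~1) | bit
    return out


def extract_lsb(samples, count):
    """Read back the first `count` embedded bits."""
    return [s & 1 for s in samples[:count]]


blue = [200, 13, 54, 255, 0, 97, 128, 31]   # hypothetical 8-bit samples
bits = [1, 0, 1, 1, 0, 0, 1, 0]             # hypothetical watermark bits
marked = embed_lsb(blue, bits)

print(marked)
print(extract_lsb(marked, len(bits)))
```

Each sample changes by at most 1, which is why the watermark is invisible, and each sample carries exactly one bit, which is why the capacity is tied to the image size.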

**Proposed Method:**

JPEG encoding and decoding procedures consist of several steps. The source
is first divided into 8x8 blocks. The FDCT takes these blocks as input and
transforms them to the DCT domain. 64 DCT coefficients are obtained. There is
a one to one mapping from the original samples to the DCT coefficients. This
concentrates the data at the lower frequencies and most of the other coefficients
are close to zero. The coefficients that are close to zero need not be encoded.
At the decoder end the DCT domain coefficients are fed to the IDCT, which converts
them back to the spatial domain. Again a one to one mapping from the frequencies
to spatial domain samples is maintained, retrieving all the original sample
values. The output of the IDCT then forms the corresponding 8x8 block of the
reconstructed image. In principle, there is no loss in the FDCT - IDCT process if
proper precision is maintained. This step does not do any compression itself
but lays the basis for it.

Next the coefficients are fed to the quantizer. The quantizer divides each
coefficient by the corresponding value from a quantization table specified by
the application. This step makes sure that the coefficients are represented
with no more precision than required. It discards the information in the image
that is visually unnecessary. At the decoder end, the quantized DCT coefficients
are multiplied by the same quantization table values. This scales the
coefficients back to their normal range, but the discarded information is lost.

The quantized coefficients are then entropy coded. Entropy encoding
further compresses the coefficients based on their statistical nature.
Huffman coding is usually employed in this step. At the decoder end the first step
is the entropy decoding, which results in quantized coefficients.
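
The DCT and quantization stages described above can be tried out on a single block. The sketch below is a plain, unoptimized illustration and not the encoder used in this work: the test block, the helper names and the uniform quantization step `STEP` are assumptions (a real quantization table holds one value per coefficient).

```python
import math

N = 8  # JPEG block size


def _scale(k):
    # Normalisation factor of the orthonormal DCT-II.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(x, u):
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def fdct(block):
    """Forward 2-D DCT of an 8x8 block given as a list of rows."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct(coefs):
    """Inverse 2-D DCT, mapping coefficients back to samples."""
    return [[sum(_scale(u) * _scale(v) * coefs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coefs, step):
    return [[round(c / step) for c in row] for row in coefs]


def dequantize(qcoefs, step):
    return [[q * step for q in row] for row in qcoefs]


def max_error(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# A smooth gradient, level-shifted by 128 as JPEG does for 8-bit samples.
block = [[8 * x + 4 * y - 128 for y in range(N)] for x in range(N)]

coefs = fdct(block)
lossless_error = max_error(block, idct(coefs))

STEP = 16  # hypothetical uniform step, used here only for simplicity
qcoefs = quantize(coefs, STEP)
lossy_error = max_error(block, idct(dequantize(qcoefs, STEP)))
nonzero = sum(1 for row in qcoefs for q in row if q != 0)

print(f"DCT/IDCT alone: max error {lossless_error:.2e}")
print(f"with quantization: max error {lossy_error:.2f}, "
      f"{nonzero} of 64 coefficients non-zero")
```

The first figure stays at the level of floating-point rounding, while the second shows that the loss comes from quantization and that a smooth block leaves only a handful of non-zero coefficients.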

*Figure 2: Watermark insertion during JPEG encoding*

*Figure 3: Watermark extraction during JPEG decoding*

The JPEG encoder is lossy at various stages. First, the DCT can lead to minor
losses due to insufficient precision while dealing with cosine coefficients.
When the DCT is applied to component values and the IDCT is immediately applied
to the DCT coefficients, the exact values are not restored because of errors in
the storage of the coefficients. Next, quantization is the major source of loss
in the entire procedure. During quantization the DCT coefficients are divided
by fixed values taken from the quantization table. For higher qualities the
quantization table contains values that tend towards 1 (so that there is no
loss of information). For low qualities the quantization table contains large
values, so most of the coefficients become zero and information is lost to some
extent in the others. After quantization, entropy coding is not lossy.
If we embed our data in the quantized coefficients, even a single-bit change
will be retained. Figures 2 and 3 show the watermarking procedure.
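
How the quantization table depends on the quality setting can be illustrated with the scaling rule of the IJG JPEG library, applied to the example luminance table from Annex K of the JPEG standard. The report itself does not list the tables that were used, so both the rule as written here and the choice of base table should be read as assumptions of this sketch; `scaled_table` is a hypothetical helper name.

```python
# Example luminance quantization table (JPEG standard, Annex K), row by row.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scaled_table(quality, base=BASE_LUMA):
    """Scale a base table for a quality setting in 1..100 (IJG-style rule)."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Entries are kept within 1..255 so they fit baseline JPEG.
    return [max(1, min(255, (v * scale + 50) // 100)) for v in base]


for quality in (100, 90, 50, 10):
    table = scaled_table(quality)
    print(f"quality {quality:3d}: smallest divisor {min(table)}, largest {max(table)}")
```

Under these assumptions quality 100 gives a table of ones, quality 90 keeps every divisor small, and quality 50 reproduces the base table with divisors of up to 121.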

One way to do this is to use the LSB of the coefficients. The image, after being
divided into 8x8 blocks, undergoes the inter-component transform. Then the DCT is
applied. The resulting DCT coefficients undergo quantization, where they are divided
by the values in the quantization table. It is in these quantized coefficients
that we embed our information.

The last bit of each coefficient is replaced with an information bit. These
coefficients are then sent to the entropy coding step. When the image is compressed
at high quality, the quantization table has small values, so the insertion
of watermark information introduces only a small error at the decoding step.
When the image is compressed at low quality, the quantization table has large
values, so the embedding of information (a change of one bit) introduces a large
error after dequantization. Hence this works with compression at
high quality and produces visible distortion with compression at low quality.
Figure 4 shows the order of embedding the watermark information in a typical
8x8 coefficient block. The gray colored squares indicate the coefficients selected
for embedding the watermark.

*Figure 4: Order of embedding the information and the coefficients selected
for embedding (shown in gray) when all the coefficients are considered*
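
The effect of the quantization table on the embedding error can be made concrete with a small sketch. It works on one block of 64 quantized coefficients, assumed to be already arranged in the embedding order of Figure 4. The coefficient values, the two uniform tables and the treatment of negative coefficients (two's complement LSB) are assumptions made for the example, not details taken from the implementation.

```python
def embed_bits(qcoefs, bits):
    """Replace the LSB of the first len(bits) quantized coefficients."""
    out = list(qcoefs)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_bits(qcoefs, count):
    return [c & 1 for c in qcoefs[:count]]


def worst_case_error(original, marked, table):
    """Largest change of a dequantized coefficient caused by embedding."""
    return max(abs(m - o) * q for o, m, q in zip(original, marked, table))


qblock = [26, -3, 2, 0, 1] + [0] * 59   # hypothetical quantized block
bits = [1, 0] * 32                      # 64 watermark bits, one per coefficient
marked = embed_bits(qblock, bits)

fine_table = [2] * 64     # stands in for a high quality table
coarse_table = [40] * 64  # stands in for a low quality table

print(extract_bits(marked, 64) == bits)
print(worst_case_error(qblock, marked, fine_table))
print(worst_case_error(qblock, marked, coarse_table))
```

A quantized coefficient changes by at most 1, but after dequantization that becomes one full quantization step, so the same bit costs far more at low quality than at high quality.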

**Method 1:**

In order to overcome this problem, one can take the help of the decoder. The decoder now knows that the image has a watermark in it. After the quantized coefficients are obtained from entropy decoding, the decoder reads the data from the LSBs and then sets the LSBs to zero. The coefficients that were zero before the embedding of watermark data turn back to zero, so the high frequency coefficients that usually become zero in JPEG no longer carry the error that was introduced. The results produced with this help from the decoder are quite acceptable. This method has a considerable amount of information storage capacity. Also, information about a particular region can be kept in that region at the 8x8 block level.

The cost is in file size. When the watermark data is put in the LSBs of the coefficients, many zero coefficients become non-zero. JPEG encodes only the coefficients from the first non-zero coefficient to the last non-zero coefficient. Thus, this method drastically increases the number of coefficients to be encoded, which leads to some increase in the file size of the image.
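
A minimal sketch of the decoder-side step of Method 1 is given below, again on one block of 64 quantized coefficients. The names `extract_and_clear` and `coded_span`, the sample block and the two's complement treatment of negative coefficients are assumptions of this example. `coded_span` counts coefficients from the start of the block up to the last non-zero one, as a rough stand-in for how much of the block the entropy coder has to describe.

```python
def embed_bits(qcoefs, bits):
    """Encoder side: replace the LSB of each quantized coefficient."""
    out = list(qcoefs)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_and_clear(qcoefs):
    """Decoder side: read every LSB, then set it to zero before dequantization."""
    bits = [c & 1 for c in qcoefs]
    cleared = [c & ~1 for c in qcoefs]
    return bits, cleared


def coded_span(qcoefs):
    """Number of coefficients up to and including the last non-zero one."""
    nonzero = [i for i, c in enumerate(qcoefs) if c != 0]
    return nonzero[-1] + 1 if nonzero else 0


original = [26, -3, 2, 0, 1] + [0] * 59   # hypothetical quantized block
bits = [1, 0] * 32
marked = embed_bits(original, bits)
recovered, cleared = extract_and_clear(marked)

print(recovered == bits)
print(coded_span(original), coded_span(marked))
print(cleared[:8])
```

Every coefficient that was zero before embedding is zero again after clearing, while a coefficient that was odd ends up one quantized unit away from its original value. The jump in `coded_span` between the original and the marked block illustrates why the file grows.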

**Method 2:**

One more way to solve the problem is to use only the non-zero coefficients to embed the watermark information. We embed watermark data only in the coefficients from the first non-zero coefficient to the last non-zero coefficient. The zero coefficients outside this range are not touched, so no error is introduced in them. The high frequency coefficients that usually become zero in JPEG therefore do not carry any introduced error. This method has a much lower information capacity, because most of the coefficients become zero after quantization even at medium quality. It does not, however, introduce extra coefficients to be encoded. Hence there will be a negligible change in the file size (it might even decrease).

During this process some coefficients that are non-zero might turn to zero. How would the decoder then know which are the first and last coefficients that contain the information, now that the zero boundary is lost? To avoid this, we leave the first and last non-zero coefficients untouched and use the second non-zero coefficient to the last but one non-zero coefficient for embedding. Figure 5 shows the order of embedding the watermark information in a typical 8x8 coefficient block. The gray colored squares indicate the coefficients selected for embedding the watermark.

*Figure 5: Order of embedding of information and the coefficients selected
for embedding (shown in gray) when only non-zero coefficients are considered*
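
The selection rule of Method 2 can be sketched as follows. The sketch reads "second non-zero to last but one non-zero" as "every position strictly between the first and the last non-zero coefficient", which is an assumption made here so that the decoder can find the same range from the two untouched boundary coefficients. The function names and the sample block are hypothetical.

```python
def carrier_positions(qcoefs):
    """Positions strictly between the first and last non-zero coefficient."""
    nonzero = [i for i, c in enumerate(qcoefs) if c != 0]
    if len(nonzero) < 2:
        return []
    return list(range(nonzero[0] + 1, nonzero[-1]))


def embed(qcoefs, bits):
    """Write as many bits as fit; return the marked block and the bit count."""
    out = list(qcoefs)
    positions = carrier_positions(qcoefs)
    for pos, bit in zip(positions, bits):
        out[pos] = (out[pos] & ~1) | bit
    return out, min(len(bits), len(positions))


def extract(qcoefs):
    """Read one bit from every carrier position of a marked block."""
    return [qcoefs[i] & 1 for i in carrier_positions(qcoefs)]


block = [26, -3, 2, 0, 1, 0, -1] + [0] * 57   # hypothetical quantized block
bits = [1, 0, 1, 1, 0]

marked, written = embed(block, bits)
print(written, marked[:8])
print(extract(marked))
```

In this block only 5 of the 64 coefficients can carry data, which matches the reduced capacity noted above, and all coefficients after the last non-zero one stay zero.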

**Results:**

In order to carry out the watermark embedding in the middle of the JPEG encoding
process, the free JPEG software from the Independent JPEG Group has been used. This
encoder has been modified to take watermark information as an input specification
and embed the watermark after the quantization process as discussed.

A similar approach is taken for the decoder. The decoder reads the watermark
after the entropy decoding stage. It outputs the watermark information
in addition to the decompressed image. Several images at various compression qualities
have been tested. Figure 6 shows one of the images on which the experiments
were carried out.

*Figure 6: Original Image*

At high quality the watermarking procedure does not produce noticeable distortion
in the image. Figures 7 and 8 show the original image compressed at quality
factor 90 and the corresponding watermarked image. The size of the original image
at quality factor 90 is 10354 bytes. It was possible to embed as much as
6808 bytes of watermark information. The size of the compressed image after
inserting the watermark at quality factor 90 is 19848 bytes.

*Figure 7: Original image at 90 quality factor*

*Figure 8: Watermarked image at 90 quality factor*

However, at low quality the error introduced in the high frequencies is much larger and the image is completely distorted. Figures 9 and 10 show the original image compressed at quality factor 50 and the corresponding watermarked image. The size of the original image at quality factor 50 is 4243 bytes. The watermark information is 6808 bytes. The size of the image after inserting the watermark at quality factor 50 is 16603 bytes.

*Figure 9: Original image at 50 quality factor*

*Figure 10: Watermarked image at 50 quality factor*
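
The file sizes reported above can be put side by side with a few lines of arithmetic. The figures are the ones quoted in this section; the dictionary layout and the helper `growth` are only illustrative.

```python
# Sizes in bytes, as reported for quality factors 90 and 50.
results = {
    90: {"original": 10354, "watermarked": 19848, "payload": 6808},
    50: {"original": 4243, "watermarked": 16603, "payload": 6808},
}


def growth(entry):
    """Extra bytes in the compressed file caused by the watermark."""
    return entry["watermarked"] - entry["original"]


for quality, entry in results.items():
    extra = growth(entry)
    print(f"quality {quality}: file grows by {extra} bytes, "
          f"{extra / entry['payload']:.2f} bytes per payload byte")
```

The file grows by more than the payload in both cases, and by more at quality factor 50 than at 90.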

In order to remove the error introduced in the high frequencies at low qualities, the decoder is modified to remove the watermark information from the image. After entropy decoding, the LSBs of the quantized coefficients are cleared. These coefficients are then sent for dequantization and further processing. Figure 11 shows the image at quality factor 50 after the watermark has been removed during decoding. Since only the decoder is changed, the size of the watermarked image and the maximum amount of watermark data remain the same.

*Figure 11: Watermark information removed with decoder help at 50 quality
factor*

An attempt was also made to avoid the distortion without modifying the decoder. During the insertion of the watermark, only coefficients that are non-zero are chosen for information embedding. Figure 12 shows the image at quality factor 50 in which only non-zero coefficients are chosen for watermarking. The amount of information that can be inserted is reduced. The size of the watermarked image is 4147 bytes.

*Figure 12: Image with watermark in only non-zero coefficients at 50 quality
factor*

**Conclusion:**

A method that can store a good amount of watermark information and is robust to JPEG compression has been given. This method suffers from loss of image quality when the watermark is inserted at low compression quality. To overcome this difficulty, two modifications to the method have been proposed. In the first, the decoder is told that a watermark is present in the image and removes the distortion that the watermark introduced during encoding. The second does not require the decoder to know about the watermark, at the cost of a more moderate information capacity.