Steganography and Approaches of Data Hiding in Digital Images

Presenter: Moinul I Zaber, Dept. of CS, Kent State University

Based on:

F5 – A Steganographic Algorithm, by Andreas Westfeld, Technische Universität Dresden, Institute for System Architecture, Dresden, Germany

Applications of Data Hiding in Digital Images, by Jessica Fridrich, Center for Intelligent Systems, SUNY Binghamton, Binghamton, NY 13902-6000, U.S.A.

Steganalysis of JPEG Images: Breaking the F5 Algorithm, by Jessica Fridrich, Miroslav Goljan, Dorin Hogea, SUNY Binghamton, Binghamton, NY 13902-6000, U.S.A.

Prologue

- Steganography is the art of invisible communication. Its purpose is to hide the very presence of communication by embedding messages into innocuous-looking cover objects.

- Data hiding in digital images is getting attention from cryptographers and security engineers.

- Existing methods are not strong against attacks.

- The goal is to devise a method that is strong against attacks and can control the image quality.

Outline of the Presentation

Introduction and History of Data Hiding

Digital Image Representation by Computer Technology

Existing Data Hiding Methods (in JPEG)

- JSteg (LSB modification)
- F5 (matrix embedding)

Discussions

Data Hiding in Digital Imagery

• Relatively young and fast growing

• Well over 90% of all publications were published in the last 10 years

• Highly multidisciplinary field combining image and signal processing with cryptography, communication theory, coding theory, signal compression, and the theory of visual perception

• Tremendous interest from industry and military

History of Data Hiding

• First techniques included invisible ink, secret writing using chemicals, templates laid over text messages, microdots, changing letter/word/line/paragraph spacing, changing fonts

• Images, video, and audio files provide sufficient redundancy for effective data hiding

• Postscript files, PDF files, and HTML can also be used for non-robust data hiding to a limited extent

• Executable files provide very little space for data hiding

• Fonts can also be used

The Need for Data Hiding

• Covert communication using images (secret message is hidden in a carrier image)

• Ownership of digital images, authentication, copyright

• Data integrity, fraud detection, self-correcting images

• Traitor-tracing (fingerprinting video-tapes)

• Adding captions to images, additional information, such as subtitles, to video, embedding subtitles or audio tracks to video (video-in-video)

• Intelligent browsers, automatic copyright information, viewing a movie in a given rated version

• Copy control (secondary protection for DVD)

Requirements by Application

Figure: a chart rating each application from low to high on six requirements (capacity, robustness, invisibility, security, embedding complexity, detection complexity). Applications shown: covert communication; copyright protection of images (authentication); fingerprinting (traitor-tracing); adding captions to images and additional information, such as subtitles, to videos; image integrity protection (fraud detection); copy control in DVD; intelligent browsers, automatic copyright information, viewing movies in a given rated version.

Redundancy and Irrelevancy Are Needed for Data Hiding

Figure panels: Original; 2 gray levels; 5 gray levels; 31 gray levels.

• Carrier – message Relationship

• Who extracts the message? (source versus destination coding)

• How many recipients are there?

• Is the key public knowledge or a shared secret?

• Do we embed different messages into one carrier?

• Are embedding and detection bundled with a key in tamper-proof hardware?

• Is the speed of embedding / detection important?

Definition of Data Hiding

Figure: the embedding algorithm takes a secret message, a carrier document, and a key; the result is transmitted via a network to a detector, which uses a key to recover the secret message.

The “Magic” Triangle

There is a trade-off between capacity, invisibility, and robustness.

Figure: a triangle with corners Undetectability, Robustness, and Capacity, on which secure steganographic techniques, digital watermarking, and naïve steganography are placed.

Additional factors:

• Complexity of embedding / extraction
• Security

Properties of hiding schemes

Robustness: The ability to extract hidden information after common image processing operations: linear and nonlinear filters, lossy compression, contrast adjustment, recoloring, resampling, scaling, rotation, noise adding, cropping, printing / copying / scanning, D/A and A/D conversion, pixel permutation in a small neighborhood, color quantization (as in palette images), skipping rows / columns, adding rows / columns, frame swapping, frame averaging (temporal averaging), etc.

Undetectability: The impossibility of proving the presence of a hidden message. This concept is inherently tied to the statistical model of the carrier image. The ability to detect the presence does not automatically imply the ability to read the hidden message. Undetectability should not be mistaken for invisibility, a concept related to human perception.

Invisibility: Perceptual transparency. This concept is based on the properties of the human visual system or the human auditory system.

Security: The embedded information cannot be removed beyond reliable detection by targeted attacks based on full knowledge of the embedding algorithm and the detector (except a secret key), and knowledge of at least one carrier with a hidden message.

Detecting secret messages

• The ability to detect secret messages in images is related to the message length. The less information we embed into the cover-image, the smaller the probability of introducing detectable artifacts by the embedding process.

• Each steganographic method has an upper bound on the maximal safe message length (or the bit-rate expressed in bits per pixel or sample) that tells us how many bits can be safely embedded in a given image without introducing any statistically detectable artifacts. Determining this maximal safe bit-rate (or steganographic capacity) is a nontrivial task even for the simplest methods.

Choice of Cover-Image

The choice of cover-images is important because it significantly influences the design of the stego-system and its security.

Images with a low number of colors, computer art, and images with unique semantic content, such as fonts, should be avoided.

Grayscale images are considered the best cover-images. Uncompressed scans of photographs, or images obtained with a digital camera and containing a high number of colors, can be considered the safest for steganography.

Choice of Image Format

• The choice of the image format also has a very big impact on the design of a secure steganographic system. Raw, uncompressed formats, such as BMP, provide the biggest space for secure steganography, but their obvious redundancy makes them very suspicious.

• Fridrich et al. have recently shown that cover-images stored in the JPEG format are a very poor choice for steganographic methods that work in the spatial domain.

• Consequently, one should avoid using decompressed JPEG images as covers for spatial steganographic methods, such as LSB embedding or its variants.

JPEG is the chosen one!

The JPEG format attracted the attention of researchers as the main steganographic format for the following reasons:

• It is the most common format for storing images. JPEG images are very abundant, and they are almost solely used for storing natural images.

• Modern steganographic methods can also provide reasonable capacity without necessarily sacrificing security. Pfitzmann and Westfeld proposed the F5 algorithm as an example of secure but high-capacity JPEG steganography. The authors presented the F5 algorithm as a challenge to the scientific community at the Fourth Information Hiding Workshop in Pittsburgh in 2001.

Digital Image

Bitmap Image

Black colors are represented by 1
White colors are represented by 0

Image representation

Color Image

RGB Image: Numbers representing images!

Data Hiding

So how do we actually hide data in a digital image?

• We can hide it in the pixel values
• We can hide it in the coefficients

LSB and how to get it?

LSB: 3 modulus 2 = 1, 4 modulus 2 = 0

LSB: odd values = 1, even values = 0
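A minimal Python sketch of the rule above: the LSB is the value modulo 2, and overwriting it changes the value by at most 1. The helper names `lsb` and `set_lsb` are illustrative and do not come from any tool discussed in the talk.

```python
def lsb(value: int) -> int:
    """Return the least significant bit: 1 for odd values, 0 for even."""
    return value % 2


def set_lsb(value: int, bit: int) -> int:
    """Return value with its least significant bit replaced by bit (0 or 1)."""
    return value - (value % 2) + bit


pixels = [3, 4, 200, 255]
print([lsb(p) for p in pixels])         # LSB of each pixel value
print([set_lsb(p, 1) for p in pixels])  # force every LSB to 1
```

Hiding one message bit per pixel then amounts to calling `set_lsb(pixel, bit)` for each pixel in turn.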

JSteg Method

D. Upham, "Jsteg Steganographic Method,"

Jsteg Algorithm

Is the LSB method secure?

• LSB modification methods are not secure.

• The secret message is easily retrievable.

The steganographic tool Jsteg embeds messages in lossy compressed JPEG files. It has a high capacity (e.g., 12% of the steganogram’s size) and is immune against visual attacks. However, a statistical attack discovers changes made by Jsteg.
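The following Python sketch shows the kind of LSB replacement Jsteg performs on quantized DCT coefficients. It assumes the coefficients are already available as a flat list of integers and that coefficients equal to 0 or 1 are skipped; the function names are hypothetical, and a real tool also has to store the message length and write a valid JPEG file.

```python
def jsteg_embed(coeffs, bits):
    """Overwrite the LSB of each usable coefficient with one message bit."""
    out = list(coeffs)
    remaining = iter(bits)
    for i, c in enumerate(out):
        if c in (0, 1):            # assumed rule: 0 and 1 carry no data
            continue
        bit = next(remaining, None)
        if bit is None:            # message exhausted
            break
        out[i] = (c & ~1) | bit    # replace the least significant bit
    return out


def jsteg_extract(coeffs, n_bits):
    """Read n_bits LSBs back from the usable coefficients."""
    bits = [c & 1 for c in coeffs if c not in (0, 1)]
    return bits[:n_bits]


cover = [12, 0, -3, 1, 5, 2, 0, -7, 4]
message = [1, 0, 0, 1, 1]
stego = jsteg_embed(cover, message)
print(stego)
print(jsteg_extract(stego, len(message)))
```

Because this kind of embedding only ever swaps values within pairs such as (2, 3) or (-4, -3), the histogram counts within each pair tend to equalize, which is the kind of artifact a statistical attack can look for.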

F5 Method

A. Westfeld, "F5: A steganographic algorithm: High capacity despite better steganalysis," in Lecture Notes in Computer Science, vol. 2137, pp. 289-302, (2001)

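F5 Method (cont.)

Matrix embedding example: two message bits are embedded into three coefficients by changing at most one of them.

$$H = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}, \quad x = \begin{pmatrix} 4 \\ 3 \\ 2 \end{pmatrix}, \quad m = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

$$m' = H x \bmod 2 = \begin{pmatrix} 7 \\ 6 \end{pmatrix} \bmod 2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \neq m$$

$$x'' = \begin{pmatrix} 4 \\ 3 \\ 3 \end{pmatrix}, \quad H x'' \bmod 2 = \begin{pmatrix} 7 \\ 7 \end{pmatrix} \bmod 2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = m$$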
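To make the matrix-embedding example concrete, here is a small Python sketch that embeds two message bits into three coefficients while changing at most one of them. It assumes the matrix H with rows (1 1 0) and (1 0 1), the coefficients x = (4, 3, 2), and the message m = (1, 1) as read from the slide. The function names are illustrative, and the chosen coefficient is decremented in absolute value, which is only one of the ways to flip its LSB.

```python
H = [[1, 1, 0],
     [1, 0, 1]]  # assumed 2x3 matrix from the slide example


def syndrome(x):
    """Return H * x mod 2: the two bits currently carried by x."""
    return [sum(h * v for h, v in zip(row, x)) % 2 for row in H]


def embed(x, m):
    """Return a copy of x, with at most one entry changed, whose syndrome is m."""
    diff = [(a + b) % 2 for a, b in zip(syndrome(x), m)]
    if diff == [0, 0]:
        return list(x)                     # x already carries m
    columns = [[row[j] for row in H] for j in range(len(x))]
    j = columns.index(diff)                # column of H equal to the mismatch
    y = list(x)
    y[j] -= 1 if y[j] > 0 else -1          # change |x_j| by one to flip its parity
    return y


x = [4, 3, 2]
m = [1, 1]
stego = embed(x, m)
print(syndrome(x), stego, syndrome(stego))
```

With these inputs the syndrome of x is (1, 0), so only the third coefficient needs to change. Any change that flips its parity, such as 2 to 3 or 2 to 1, yields the syndrome (1, 1).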

Steps of F5 Algorithm

1. Get the RGB representation of the input image.

2. Calculate the quantization table corresponding to quality factor Q and compress the image while storing the quantized DCT coefficients.

3. Compute the estimated capacity with no matrix embedding, C = hDCT – hDCT/64 – h(0) – h(1) + 0.49h(1), where hDCT is the number of all DCT coefficients, h(0) is the number of AC DCT coefficients equal to zero, h(1) is the number of AC DCT coefficients with absolute value 1, hDCT/64 is the number of DC coefficients, and –h(1) + 0.49h(1) = –0.51h(1) is the estimated loss due to shrinkage (see Step 5). The parameter C and the message length together determine the best matrix embedding.

4. The user-specified password is used to generate a seed for a PRNG that determines the random walk for embedding the message bits. The PRNG is also used to generate a pseudo-random bit-stream that is XOR-ed with the message to make it a randomized bit-stream. During the embedding, DC coefficients and coefficients equal to zero are skipped.

5. The message is divided into segments of k bits that are embedded into a group of 2^k – 1 coefficients along the random walk. If the hash of that group does not match the message bits, the absolute value of one of the coefficients in the group is decreased by one to obtain a match. If the coefficient becomes zero, the event is called shrinkage, and the same k message bits are re-embedded in the next group of DCT coefficients (we note that LSB(d) = d mod 2 for d > 0, and LSB(d) = 1 – d mod 2 for d < 0).

6. If the message size fits the estimated capacity, the embedding proceeds; otherwise an error message showing the maximal possible length is displayed. There are rare cases when the capacity estimation is wrong due to a larger than anticipated shrinkage. In those cases, the program embeds as much as possible and displays a warning.
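The capacity estimate in step 3 and the LSB convention in step 5 are simple enough to sketch in Python. The function names and the sample histogram counts below are made up for illustration; only the two formulas come from the steps above.

```python
def f5_lsb(d: int) -> int:
    """LSB as defined in step 5: d mod 2 for d > 0, 1 - (d mod 2) for d < 0."""
    if d == 0:
        raise ValueError("zero coefficients are skipped and carry no bit")
    parity = abs(d) % 2
    return parity if d > 0 else 1 - parity


def estimated_capacity(h_dct: int, h0: int, h1: int) -> float:
    """C = hDCT - hDCT/64 - h(0) - h(1) + 0.49*h(1), with no matrix embedding."""
    return h_dct - h_dct / 64 - h0 - h1 + 0.49 * h1


# Hypothetical counts: 1000 blocks of 64 coefficients each.
h_dct, h0, h1 = 64_000, 40_000, 8_000
print(estimated_capacity(h_dct, h0, h1))
print([f5_lsb(d) for d in (3, 2, -1, -2)])
```

Under this convention, decreasing the absolute value of any nonzero coefficient by one flips its LSB unless the coefficient becomes zero, which is the shrinkage case described in step 5.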

Take Home Information

• We assume that the steganographic method is publicly known with the exception of a secret key. The method is secure if the stego-images do not contain any detectable artifacts due to message embedding. In other words, the set of stego-images should have the same statistical properties as the set of cover-images.

• If there exists an algorithm that can guess whether or not a given image contains a secret message with a success rate better than random guessing, the steganographic system is considered broken.

Conclusion

• Digital data hiding is a research topic of current interest.

• Existing algorithms are not secure.

• The protocol is known but the key is unknown.

• F5 has high security due to matrix embedding.

• F5 makes fewer modifications and is therefore less vulnerable.

• F5 offers high capacity despite better steganalysis.

• F5 is probably one of the most advanced programs publicly available. It uses methods to compensate for the introduced changes, so statistical analysis is difficult. Yet it is considered broken.

Demonstration of StegoMagic.

Discussion
