What are data compression algorithms

Data compression

Image formats

Many image file formats use compression methods to reduce the storage space required by bitmap image data.

Compression methods are differentiated according to whether they remove details and colors from the image.

Lossless methods compress image data without removing details.
Lossy methods compress images by removing details.

The most common compression methods are:

Run length encoding (RLE)

  • Runs of identical, consecutive pixels within a pixel line are stored as a count plus the pixel value, which reduces the amount of data.
  • Abstract example (the original is stored in binary form): 00011111111100000 becomes 309150 (3 × 0, 9 × 1, 5 × 0)
  • lossless
  • Practical application: Especially suited to repetitive structures such as graphics and clip art. Less suitable for photographic "pixel images", as these do not have high repetition rates. Exception: JPG after the Fourier transformation (image structures are leveled).
  • Formats: TIFF, BMP, RLE (old Windows format), or as part of JPG compression
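The run-length example above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `rle_encode` and `rle_decode` are chosen for illustration, and real formats store the pairs in binary form rather than as text.

```python
from itertools import groupby

def rle_encode(bits: str) -> list[tuple[int, str]]:
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(bits)]

def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Expand the (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in pairs)

pairs = rle_encode("00011111111100000")
print(pairs)                                    # [(3, '0'), (9, '1'), (5, '0')]
print("".join(f"{c}{s}" for c, s in pairs))     # 309150
print(rle_decode(pairs))                        # 00011111111100000
```

Because decoding restores the input exactly, the method is lossless. With few repetitions the list of pairs can become longer than the input, which is why the method suits graphics better than photos.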

Huffman coding

  • Frequent tone values are given a short binary code, e.g. 0 or 10; rare tone values are given a longer binary code, e.g. 11111111.
  • Huffman coding assumes that the distribution of the tone values is not uniform but peaked (similar to a Gaussian curve), so that some values occur much more often than others.
  • The file header must contain a code table so that the encoded content can be decoded.
  • lossless
  • Images with very evenly distributed tone values (e.g. a cyan wedge) compress less well.
  • Practical application: CCITT (Group 4) compression (PDF), within JPG compression, within MP3 compression
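The idea of short codes for frequent values and long codes for rare ones can be sketched as follows. The function names and the sample tone values are assumptions made for illustration; real file formats define their own table layout in the header.

```python
import heapq
from collections import Counter

def huffman_codes(values):
    """Build a prefix code: frequent values get short codes, rare ones long codes."""
    freq = Counter(values)
    if len(freq) == 1:                       # degenerate case: one distinct value
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {value: code_so_far})
    heap = [(w, i, {v: ""}) for i, (v, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # the two rarest groups ...
        w2, _, right = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in left.items()}
        merged.update({v: "1" + c for v, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))  # ... are merged into one
        tie += 1
    return heap[0][2]

def huffman_decode(bits, table):
    """Decode a bit string using the code table (the 'file header')."""
    reverse = {code: value for value, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out

# Hypothetical tone values: 255 is very frequent, 0 is rare.
pixels = [255] * 12 + [128] * 5 + [64] * 2 + [0] * 1
table = huffman_codes(pixels)
encoded = "".join(table[p] for p in pixels)
print(table)
print(len(encoded), "bits instead of", len(pixels) * 8)
```

With a perfectly even distribution all codes end up the same length, so nothing is gained. This matches the remark about evenly distributed images above.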

Lempel-Ziv-Welch (LZW)

  • Comparison of image content: If information that has already been encoded is repeated, it is not encoded again; instead, a cross-reference to the existing image area is stored.
  • lossless
  • Practical application: Compression method supported by TIFF, PDF, GIF and PostScript.
  • ideal for compressing images with large, monochrome areas or text
  • PNG compression is based on a similar principle (Deflate, a combination of LZ77 and Huffman coding). PNG was developed as an alternative to GIF because of the patent dispute over LZW and has since prevailed.
  • Link: http://de.wikipedia.org/wiki/Lempel-Ziv-Welch-Algorithmus
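The cross-reference principle can be illustrated with a small dictionary-based sketch. It assumes 8-bit input symbols and an unbounded dictionary; the function names are illustrative, and the variants used in GIF or TIFF add details such as variable code widths and clear codes that are left out here.

```python
def lzw_encode(data: str) -> list[int]:
    """Replace repeated sequences by references (codes) into a growing dictionary."""
    dictionary = {chr(i): i for i in range(256)}   # all single 8-bit symbols
    next_code = 256
    current = ""
    output = []
    for ch in data:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                    # known sequence: keep extending
        else:
            output.append(dictionary[current])     # emit reference to known part
            dictionary[candidate] = next_code      # remember the new sequence
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output

def lzw_decode(codes: list[int]) -> str:
    """Rebuild the dictionary while decoding; no table needs to be transmitted."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    result = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:                    # sequence defined by this very step
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        result.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(result)

text = "ABABABABABAB"
codes = lzw_encode(text)
print(len(text), "symbols ->", len(codes), "codes")
```

The more often a sequence repeats, the longer the dictionary entries become, which is why large single-colored areas and text compress well.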

Differential Pulse Code Modulation (DPCM)

  • It is not the tone value itself that is encoded, but the difference to the neighboring pixel. Because the difference is usually smaller than the absolute value, the numbers to be stored are smaller, which reduces the amount of data.
  • lossless
  • Practical application: Audio compression and within JPG compression
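A minimal sketch of the difference principle, assuming the first value is stored as is and every further value as the difference to its predecessor (the function names and the sample pixel row are illustrative):

```python
def dpcm_encode(samples):
    """Store the first value, then only the difference to the previous value."""
    if not samples:
        return []
    diffs = [samples[0]]
    for previous, current in zip(samples, samples[1:]):
        diffs.append(current - previous)
    return diffs

def dpcm_decode(diffs):
    """Add up the differences again to recover the original values."""
    out, total = [], 0
    for d in diffs:
        total += d
        out.append(total)
    return out

row = [120, 122, 123, 123, 125, 124]
print(dpcm_encode(row))               # [120, 2, 1, 0, 2, -1]
print(dpcm_decode(dpcm_encode(row)))  # [120, 122, 123, 123, 125, 124]
```

The differences themselves do not save space yet. The saving comes from the fact that small numbers need fewer bits and occur often, so a later step such as Huffman coding can encode them compactly.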

Fourier transformation

  • In the Fourier transformation, density gradients (e.g. within an image line) are represented as mathematical functions. The individual pixel density values are thus combined into a curve. This is done in blocks of 8x8 pixels (which is why artifacts appear in the size of 8x8 pixels). JPG uses the discrete cosine transform (DCT), a close relative of the Fourier transformation.
  • To achieve higher compression, the overall curve is simplified: small deflections are flattened. The degree of compression is adjustable (the JPG quality control sets it).
  • lossy
  • Practical application: Within JPG compression
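The two steps above (transform, then flatten small deflections) can be sketched with a discrete cosine transform. For brevity this sketch works on a single row of 8 values, whereas JPG works on two-dimensional 8x8 blocks; the quantization step of 20 and the sample row are assumptions chosen for illustration.

```python
import math

N = 8  # block length, as in the 8x8 blocks mentioned above

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(block):
    """Orthonormal DCT-II: describes N values as a sum of cosine curves."""
    return [
        _scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                        for n, x in enumerate(block))
        for k in range(N)
    ]

def idct(coeffs):
    """Inverse transform: rebuilds the N values from the cosine coefficients."""
    return [
        sum(_scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]

def compress(block, step):
    """Lossy part: coefficients are rounded to multiples of `step`."""
    return [round(c / step) for c in dct(block)]

def decompress(quantized, step):
    return [round(v) for v in idct([q * step for q in quantized])]

row = [100, 102, 104, 106, 108, 110, 112, 114]   # a smooth density gradient
quantized = compress(row, step=20)
print(quantized)                    # most entries become 0
print(decompress(quantized, step=20))
```

A larger step turns more coefficients into zero, which compresses better but moves the restored values further from the original. The long runs of zeros are what the subsequent run-length and Huffman steps exploit.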

Joint Photographic Experts Group (JPEG)

  • JPEG is a lossy method supported by the JPEG, TIFF, PDF and PostScript formats. JPEG compression gives the best results for continuous-tone images, e.g. photos.
  • The following compression methods are used in JPG compression:
    1. Conversion into the YUV color space (or YCbCr), which separates brightness and color information (the eye is more sensitive to brightness than to color)
    2. Fourier transformation of color (stronger compression) and brightness (weaker compression)
    3. DPCM coding
    4. Run-length coding
    5. Huffman coding
    ATTENTION: Steps 3 and 4 are at odds with the compendium; Wikipedia describes them differently. To be checked.
  • Practical application: With JPEG compression, you specify the image quality by choosing an option from the “Quality” menu, moving the “Quality” slider, or entering a value between 1 and 12 (10) in the “Quality” text field. Choose the compression with the highest quality to get the best print result. Files with JPEG encoding can only be output on PostScript Level 2 printers (or higher) and may not be separable into individual plates.
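Step 1 of the list above, the separation of brightness and color, can be sketched as follows. The coefficients are the ones commonly used for JPEG files (JFIF, full-range YCbCr); the function name is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Split an 8-bit RGB pixel into brightness (Y) and two color channels (Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp to the valid 8-bit range 0..255.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))

print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128): white, no color component
print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128): gray differs from white only in Y
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128)
```

For neutral tones all information sits in the Y channel. This is what allows the color channels to be compressed more strongly than the brightness in step 2.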


  • CCITT encoding is a group of lossless compression methods for black-and-white images, supported by the PDF and PostScript file formats.
  • CCITT is the abbreviation of the French name of the International Telegraph and Telephone Consultative Committee, Comité Consultatif International Téléphonique et Télégraphique.
  • see Huffman coding above


  • ZIP encoding is a lossless compression method supported by the PDF and TIFF file formats. Like LZW, ZIP compression works best for images with large, single-colored areas.
  • ATTENTION: ZIP is used more as a container compression format than for images, and not all programs that have …zip… in their name actually use this compression. Please check.

(ImageReady) PackBits

  • PackBits is a lossless compression method that uses a run-length compression scheme. In ImageReady, PackBits is only supported by the TIFF format.

MPEG (Moving Picture Experts Group)

  • MPEG combines intra-frame compression (frame = single picture):
    Every single picture is compressed like a JPG.
  • and inter-frame compression: the image content of several frames is compared. Only changed (moving) image content is re-encoded.
  • Example (inter-frame): A person stands and lifts their right arm. Only the moving right arm is encoded. Then the right arm stays up and the left arm is raised as well. Now only the left arm (plus the background it uncovers, of course) is encoded.
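The inter-frame idea can be sketched by storing only the pixels that differ from the previous frame. This is a strong simplification: actual MPEG encoders work on blocks and use motion compensation rather than comparing single pixels. The function names and the tiny example frames are assumptions for illustration.

```python
def frame_delta(previous, current):
    """Return only the changed pixels as {(row, column): new_value}."""
    return {
        (r, c): value
        for r, row in enumerate(current)
        for c, value in enumerate(row)
        if previous[r][c] != value
    }

def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = [row[:] for row in previous]
    for (r, c), value in delta.items():
        frame[r][c] = value
    return frame

frame1 = [[0, 0, 0],
          [0, 5, 0]]
frame2 = [[0, 0, 0],
          [0, 5, 9]]   # only one pixel has "moved"

delta = frame_delta(frame1, frame2)
print(delta)                        # {(1, 2): 9}
print(apply_delta(frame1, delta))   # [[0, 0, 0], [0, 5, 9]]
```

The less the picture changes between frames, the smaller the delta becomes compared with a full frame.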


Preflight in layout programs checks whether the image resolution, the data depth and the output color space are adequate for the output medium. For example, a 72 dpi image is not suitable for printing. Preflight would output a corresponding error message.


The sampling rate, number of channels and frame rate are relevant for audio and video.

The sampling rate is the value that indicates how often an audio signal is sampled in a given time. The higher this rate, the more accurately the sound is reproduced later. It is given in hertz, which stands for "1 per second". 44 kHz therefore corresponds to 44,000 samples per second.

If you decrease this value, fewer samples are stored per second. Information is lost as a result.

An example:

A staircase has 100 steps that are easy to climb. If you double the number of steps (the sampling rate), each step is only half as high. If you keep adding steps, you eventually get the impression of a smooth slope and thus a very precise sound.
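The effect of the sampling rate on the amount of data can be sketched like this. The 440 Hz test tone, the 44,100 Hz rate and the function names are assumptions for illustration. Proper downsampling also filters the signal first to avoid aliasing, which is left out here.

```python
import math

def sample_sine(freq_hz, rate_hz, duration_ms):
    """Sample a sine tone: one value per 1/rate_hz seconds."""
    count = rate_hz * duration_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

def decimate(samples, factor):
    """Lower the sampling rate by keeping only every `factor`-th sample."""
    return samples[::factor]

tone = sample_sine(freq_hz=440, rate_hz=44100, duration_ms=10)
halved = decimate(tone, 2)
print(len(tone), "samples at 44,100 Hz")   # 441 samples at 44,100 Hz
print(len(halved), "samples at 22,050 Hz") # 221 samples at 22,050 Hz
```

The discarded samples cannot be recovered afterwards, so lowering the sampling rate is lossy.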

The number of channels indicates how many audio tracks run in parallel in a video or piece of music. It is one for mono, two for stereo, six for 5.1 surround sound and eight for 7.1 surround sound. Compression can be achieved by removing channels. For example, two audio tracks (stereo) can be converted into one (mono). The impression of spatial sound is lost, so reducing the number of audio channels is lossy.
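A stereo-to-mono conversion can be sketched by averaging the two channels sample by sample. Averaging is one common choice and is assumed here; the function name and the sample values are illustrative.

```python
def stereo_to_mono(left, right):
    """Merge two channels into one by averaging each pair of samples."""
    return [(l + r) / 2 for l, r in zip(left, right)]

left = [0.5, 0.25, -0.5, 0.0]
right = [0.5, -0.25, 0.0, 1.0]
mono = stereo_to_mono(left, right)
print(mono)   # [0.5, 0.0, -0.25, 0.5]
```

The result holds half as many values, but the original left and right channels can no longer be separated from it.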

The frame rate indicates how many frames per second are encoded in a video signal.

25 frames per second (FPS), 29.97 FPS, 30 FPS and 60 FPS are common. Compression can be achieved by reducing the frame rate, i.e. storing fewer images per second than in the source material. However, this also means that information is lost.
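Reducing the frame rate can be sketched by dropping frames at regular intervals. The sketch assumes integer frame rates and simply picks the nearest earlier source frame; the function name is illustrative, and real converters may blend frames instead.

```python
def reduce_frame_rate(frames, source_fps, target_fps):
    """Keep only as many frames as the lower target rate needs."""
    if not 0 < target_fps <= source_fps:
        raise ValueError("target_fps must be positive and not above source_fps")
    kept_count = len(frames) * target_fps // source_fps
    # Output frame i is taken from source position i * source_fps / target_fps.
    return [frames[i * source_fps // target_fps] for i in range(kept_count)]

one_second = list(range(60))                 # 60 frames, numbered 0..59
print(len(reduce_frame_rate(one_second, 60, 30)))   # 30
print(reduce_frame_rate(one_second, 60, 30)[:5])    # [0, 2, 4, 6, 8]
print(len(reduce_frame_rate(one_second, 60, 25)))   # 25
```

Going from 60 FPS to 30 FPS discards every second frame, which halves the data before any further compression is applied.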