A 12-megapixel image measuring 4000 × 3000 contains 12 million pixels. At three bytes per pixel—one byte each for red, green, and blue—the simplest RGB representation is 36 MB before metadata or padding. A JPEG can often make a photograph dramatically smaller because it stores a compact recipe for rebuilding a close approximation, not a literal list of every original RGB value.

12 MP raw RGB: 36 MB
Basic recipe: 24 bits / pixel
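The arithmetic behind that figure is short enough to check directly. This is a minimal sketch; the function name `raw_rgb_bytes` is made up for illustration and ignores metadata and row padding, as the text does.

```python
def raw_rgb_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of an uncompressed pixel grid, ignoring metadata and row padding."""
    return width * height * bytes_per_pixel


size = raw_rgb_bytes(4000, 3000)
print(f"{size:,} bytes = {size / 1_000_000:.0f} MB")  # 36,000,000 bytes = 36 MB
```

The same function works for any canvas size, and passing `bytes_per_pixel=4` accounts for an added alpha channel.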

Compress a generated test image

The browser draws the source scene, simulates the selected chroma sampling, then passes it through its JPEG encoder. Watch small text, saturated edges, gradients, and block boundaries as you change the parameters.

[Interactive demo, not reproduced here. No image is downloaded; the scene is generated and encoded locally. Panels: generated source, decoded JPEG, edge magnifier. Readouts: encoded payload, raw RGB equivalent (421.9 KiB), approximate reduction, and pipeline (shown at 4:2:0 · Q72).]

Browser encoders choose their own quantization tables and internal settings, so “quality 72” is not a universal JPEG recipe. File sizes are real for this generated 480 × 300 canvas.

The compression assembly line

There are variations, but the familiar lossy JPEG path can be understood as five transformations. Each one changes the question from “what color is this exact pixel?” toward “what visible structure is present in this small region?”

01 · Pixels: RGB samples (direct color values)
02 · Color: Light + color (YCbCr channels)
03 · Sampling: Less color detail (often 4:2:0)
04 · Blocks: 8 × 8 frequencies (via the DCT)
05 · Bytes: Quantize + pack (discard, reorder, code)
A conceptual baseline-JPEG pipeline. Real encoders make many choices inside these stages.

First separate brightness from color

JPEG commonly converts RGB into YCbCr: one channel for luma-like brightness detail and two channels for color differences. Human vision usually notices fine brightness edges more readily than equally fine color variation, so an encoder can keep luma at full detail while sampling chroma more sparsely. In common 4:2:0 sampling, each 2 × 2 group of four luma samples shares one sample from each chroma channel, so the two chroma planes together hold half as many samples as the luma plane.

Y · brightness
Cb · blue difference
Cr · red difference
YCbCr reorganizes color into a detailed luma-like plane and two color-difference planes that can be sampled differently.
Compression is not merely “lower quality.” It is a budget. JPEG spends more of that budget on the kinds of spatial detail we tend to notice and less on the kinds we tend to forgive.
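The color conversion and 4:2:0 sampling described in this section can be sketched in a few lines. The conversion uses the full-range BT.601 coefficients associated with JFIF files. Averaging each 2 × 2 neighborhood is an assumption made here for simplicity, since real encoders may filter chroma differently. The 4 × 4 test image is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion (JFIF convention); values are 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane):
    """Replace each 2x2 neighborhood with its average (even dimensions assumed)."""
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


# Hypothetical 4x4 test image: left half red, right half blue.
image = [[(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)] for _ in range(4)]
ycbcr = [[rgb_to_ycbcr(*px) for px in row] for row in image]

y_plane = [[px[0] for px in row] for row in ycbcr]                   # kept at full detail
cb_plane = subsample_420([[px[1] for px in row] for row in ycbcr])   # 2x2 samples
cr_plane = subsample_420([[px[2] for px in row] for row in ycbcr])   # 2x2 samples

samples_before = 3 * 16
samples_after = 16 + len(cb_plane) * len(cb_plane[0]) + len(cr_plane) * len(cr_plane[0])
print(samples_before, "->", samples_after)  # 48 -> 24
```

Half the samples are gone before any frequency transform runs, and all of the saving comes out of the two color-difference planes.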

Then turn pixels into frequencies

The image is divided into 8 × 8 sample blocks. A discrete cosine transform (DCT) expresses each block as a weighted mixture of 64 patterns: one average value, followed by increasingly rapid changes across the block. Smooth skies concentrate their energy near the low-frequency corner. Hair, grass, and noise spread useful information farther across the grid.

Pixel space

An 8 × 8 patch is described as 64 brightness or color samples at fixed locations.

Frequency space

The same patch becomes 64 coefficients, usually strongest near the low-frequency corner.

The DCT reorganizes information; it does not itself have to lose any. The deliberate loss arrives next.
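A direct, unoptimized version of the 8 × 8 transform shows both claims: a flat block collapses into a single average coefficient, and the inverse transform returns the original samples up to floating-point error. This is a teaching sketch written from the textbook DCT-II formula, not how production encoders compute it. They use fast factorizations and subtract 128 from each sample first.

```python
import math

N = 8


def _scale(k):
    """Normalization factor that makes the transform orthonormal."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def _basis(pos, freq):
    """One-dimensional cosine pattern sampled at position pos."""
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    return [
        [
            0.25 * _scale(u) * _scale(v) * sum(
                block[x][y] * _basis(x, u) * _basis(y, v)
                for x in range(N) for y in range(N)
            )
            for v in range(N)
        ]
        for u in range(N)
    ]


def idct2(coeffs):
    """Inverse transform: rebuild the samples from the 64 coefficients."""
    return [
        [
            0.25 * sum(
                _scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                for u in range(N) for v in range(N)
            )
            for y in range(N)
        ]
        for x in range(N)
    ]


flat = [[50.0] * N for _ in range(N)]       # a perfectly smooth patch
coeffs = dct2(flat)
print(f"{coeffs[0][0]:.3f}")                # 400.000, the only significant coefficient

ramp = [[float(x + y) for y in range(N)] for x in range(N)]
restored = idct2(dct2(ramp))
worst = max(abs(ramp[x][y] - restored[x][y]) for x in range(N) for y in range(N))
print(worst < 1e-9)                         # True: the DCT alone loses nothing
```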

Quantization is the bargain

Each DCT coefficient is divided by a value from a quantization table and rounded. Small high-frequency coefficients often become zero. Larger divisors make a smaller file but throw away more subtle variation. This is the irreversible step behind a typical JPEG “quality” control—and that quality number is an encoder-specific shortcut, not a universal percentage.
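The divide-and-round step can be sketched as follows. Both the coefficient block and the quantization table are invented for illustration. They are not the tables from any real encoder, and the `factor` argument merely stands in for turning a quality control down.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Decoder side: multiply back. The rounding error is not recoverable."""
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


def scaled(table, factor):
    """Coarser table: larger divisors, as a lower quality setting would use."""
    return [[q * factor for q in row] for row in table]


def count_zeros(levels):
    return sum(lv == 0 for row in levels for lv in row)


# Hypothetical data: coefficients that fade toward high frequencies,
# and divisors that grow toward high frequencies.
coeffs = [[300.0 / (1 + u + v) ** 2 for v in range(8)] for u in range(8)]
base_table = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]

for factor in (1, 4):
    table = scaled(base_table, factor)
    levels = quantize(coeffs, table)
    restored = dequantize(levels, table)
    worst = max(abs(c - r) for c_row, r_row in zip(coeffs, restored)
                for c, r in zip(c_row, r_row))
    print(f"factor {factor}: {count_zeros(levels)} zeros, worst error {worst:.2f}")
```

Each restored coefficient can be off by up to half its divisor, which is why coarser tables both shrink the file and raise the error.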

The remaining values are read in a zigzag order that tends to place long runs of zeros together. Run-length and entropy coding can then represent common values with fewer bits. Decompression reverses the coding, multiplies by the quantization values, applies the inverse DCT, and converts the result back toward RGB. The discarded detail cannot return; the decoder reconstructs its best available approximation.
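The zigzag scan and the zero-run idea can be sketched like this. The run-length output is deliberately simplified to `(zeros skipped, value)` pairs plus an end-of-block marker. Real JPEG streams code the first coefficient separately, cap run lengths, and feed the result to Huffman or arithmetic coding, none of which is shown. The quantized block is hypothetical.

```python
def zigzag_order(n=8):
    """(row, col) positions in zigzag order, walking anti-diagonals alternately."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]  # top-right to bottom-left
        if s % 2 == 0:
            diagonal.reverse()                                       # bottom-left to top-right
        order.extend(diagonal)
    return order


def run_length(values):
    """Simplified zero-run coding: (zeros skipped, nonzero value) pairs, then 'EOB'."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # trailing zeros collapse into one end-of-block marker
    return pairs


# Hypothetical quantized block: a few low-frequency survivors, zeros elsewhere.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 50, -3, 2, 1

scanned = [block[r][c] for r, c in zigzag_order()]
print(zigzag_order()[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(run_length(scanned))  # [(0, 50), (0, -3), (0, 2), (1, 1), 'EOB']
```

Sixty-four values shrink to four pairs and a marker because the scan order pushes the zeros to the end of the list.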

What the mistakes look like

Blocking (block boundaries)

Strong compression can reveal the 8 × 8 working grid.

Ringing (sharp edges)

High-contrast borders can grow faint ripples or halos.

Chroma blur (fine color)

Small colored text and saturated edges can look softer than their brightness detail.

Raw is not one thing

A camera RAW file is usually not a simple RGB bitmap. It may contain sensor mosaic values, metadata, previews, and sometimes lossless or lossy compression of its own. “Raw RGB” here means the useful thought experiment: storing final pixel samples directly. Likewise, JPEG is both a family of coding modes and, in everyday speech, a file carrying a familiar DCT-based JPEG stream.

Lossless takes a different bargain

Lossless compression must reproduce the exact original sample values. PNG filters each scanline to make neighboring values easier to compress, then uses the DEFLATE algorithm—a combination of repeated-string references and entropy coding. Flat graphics, text, repeated patterns, and transparency often compress well. Photographic noise does not repeat politely, so a lossless photograph can remain much larger than a visually similar lossy image.

01 · Samples: Pixel rows (exact RGBA values)
02 · Filter: Predict neighbors (store differences)
03 · Match: Find repeats (back references)
04 · Code: Shorten symbols (Huffman codes)
05 · Chunks: Package + verify (metadata + checks)
PNG is lossless: after decoding and undoing the scanline filters, every stored sample returns exactly.
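The filter-then-DEFLATE idea can be tried with Python's standard `zlib` module. This sketch implements only PNG's Sub filter, which stores each byte as its difference from the byte one pixel to the left. It applies the filter to a single hypothetical grayscale scanline. A real PNG encoder chooses among several filter types per row, prefixes each row with a filter-type byte, and wraps the result in chunks, none of which is shown.

```python
import zlib


def sub_filter(row: bytes, bpp: int = 1) -> bytes:
    """PNG 'Sub' filter: each byte minus the byte bpp positions earlier, modulo 256."""
    return bytes(
        (row[i] - (row[i - bpp] if i >= bpp else 0)) % 256
        for i in range(len(row))
    )


def sub_unfilter(filtered: bytes, bpp: int = 1) -> bytes:
    """Undo the Sub filter by adding back the already-reconstructed left neighbor."""
    out = bytearray()
    for i, delta in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        out.append((delta + left) % 256)
    return bytes(out)


# Hypothetical scanline: a steady gray ramp that wraps around at 256.
row = bytes((3 * i) % 256 for i in range(1024))
filtered = sub_filter(row)                 # almost every byte becomes the same value, 3

raw_size = len(zlib.compress(row, 9))
filtered_size = len(zlib.compress(filtered, 9))
print(len(row), raw_size, filtered_size)   # the filtered row compresses to far fewer bytes

assert sub_unfilter(filtered) == row       # lossless: every sample returns exactly
```

The filter removes no information. It only rewrites the row so that DEFLATE sees one repeated value instead of a long sequence of distinct ones.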

One picture, several kinds of machinery

JPEG

DCT-based lossy coding excels at ordinary photographs and broad compatibility. Classic JPEG has no alpha channel.

lossy · photo · 8×8
PNG

Filtered scanlines plus DEFLATE preserve exact samples and full alpha. Strong for graphics, UI, and screenshots.

lossless · alpha · graphics
WebP

A RIFF container can carry VP8-derived lossy images, a separate lossless mode, alpha, metadata, and animation.

lossy · lossless · animation
AVIF

Stores AV1-coded image items in an ISO base-media structure. Supports lossy or lossless coding, HDR, wide color, alpha, and sequences.

AV1 tools · HDR · alpha
GIF

A palette format using LZW compression. Its 256-color limit and one-bit transparency are restrictive; simple frame animation made it culturally durable.

palette · lossless indices · animation
SVG

XML instructions describe paths, shapes, text, gradients, and filters. It scales cleanly because it stores a scene, not a fixed pixel grid.

vector · DOM · scalable

RGB is only part of the picture

An alpha channel describes coverage or opacity. Straight alpha stores color independently from alpha; premultiplied alpha stores color already multiplied by coverage, which can make compositing numerically convenient. Formats and graphics APIs must agree on interpretation or translucent edges can grow dark or bright fringes.

Red
Green
Blue
Alpha
Alpha is not “transparent color.” It is extra information used when compositing the stored color over a background.
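The fringe problem is easy to reproduce with one pixel. The sketch below uses the standard "over" operator on an opaque background, with color values in the range 0 to 1. The function names are made up for illustration.

```python
def premultiply(rgb, alpha):
    """Convert straight color to premultiplied color."""
    return tuple(c * alpha for c in rgb)


def over_straight(src_rgb, alpha, dst_rgb):
    """Composite a straight-alpha source over an opaque background."""
    return tuple(s * alpha + d * (1 - alpha) for s, d in zip(src_rgb, dst_rgb))


def over_premultiplied(src_premul, alpha, dst_rgb):
    """Composite a premultiplied source: the color is already scaled by alpha."""
    return tuple(s + d * (1 - alpha) for s, d in zip(src_premul, dst_rgb))


white = (1.0, 1.0, 1.0)
background = (1.0, 1.0, 1.0)
alpha = 0.5                      # a half-covered edge pixel

stored = premultiply(white, alpha)

print(over_straight(white, alpha, background))        # (1.0, 1.0, 1.0)
print(over_premultiplied(stored, alpha, background))  # (1.0, 1.0, 1.0)

# Mismatch: premultiplied data handed to code that expects straight alpha.
print(over_straight(stored, alpha, background))       # (0.75, 0.75, 0.75), a dark fringe
```

A white edge over a white background should be invisible. Multiplying by alpha twice darkens it, which is the dark fringe. The opposite mismatch, treating straight data as premultiplied, gives edges that are too bright.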

Bit depth and color space change what values mean

Eight bits per channel offer 256 code values per channel; ten or twelve bits provide finer steps. But bit depth alone does not define visible color. Color primaries describe the gamut, a transfer function maps encoded values to light, and metadata tells software how to interpret them. HDR combines greater range with appropriate color and transfer characteristics—not merely a larger file.

3-bit illustration: visible steps
8-bit SDR: 256 codes / channel
10-bit HDR-capable: 1,024 codes / channel
The first ramp exaggerates banding so the role of code-value precision is visible on an ordinary display.
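The code-value counts follow directly from the bit depth, and the banding in a low-precision ramp is simply a smooth input snapping to the few codes available. This is a minimal sketch covering precision only. It says nothing about primaries or transfer functions, and the 4,096-sample ramp is an arbitrary choice.

```python
def to_code(value: float, bits: int) -> int:
    """Snap a 0..1 value to the nearest integer code at the given bit depth."""
    max_code = (1 << bits) - 1
    return round(value * max_code)


ramp = [i / 4095 for i in range(4096)]   # a smooth 0..1 gradient

for bits in (3, 8, 10):
    codes = {to_code(v, bits) for v in ramp}
    step = 1 / ((1 << bits) - 1)
    print(f"{bits:>2} bits: {len(codes):>4} distinct codes, step {step:.5f}")
```

At 3 bits the whole gradient lands on eight values, so each band is wide enough to see. At 8 and 10 bits the steps shrink to roughly 0.4% and 0.1% of the range.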

Choose by what must survive

Need | Useful starting point | Why
Broadly compatible photograph | JPEG | Simple delivery and mature decoding everywhere
Exact UI, screenshot, or transparency | PNG | Lossless samples and alpha
Modern mixed web imagery | WebP | Lossy, lossless, alpha, and animation in one family
High compression, HDR, wide gamut | AVIF | Modern AV1 image tools and rich color support
Logos and diagrams | SVG | Resolution-independent scene instructions
Editing latitude from a camera | Camera RAW | Sensor-oriented data and capture metadata
