HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization.
HTTP data is compressed before it is sent from the server: compliant browsers announce which methods they support to the server before downloading the correct format; browsers that do not support any compatible compression method download the data uncompressed. The most common compression schemes are gzip and deflate; however, the complete list of available schemes is maintained by IANA. In addition, third parties develop new methods and incorporate them into their products, such as the Google Shared Dictionary Compression for HTTP (SDCH) scheme implemented in the Google Chrome browser and used on Google servers.
There are two different ways compression can be performed in HTTP. At a lower level, a Transfer-Encoding header field may indicate that the payload of an HTTP message is compressed. At a higher level, a Content-Encoding header field may indicate that a resource being transferred, cached, or otherwise referenced is compressed. Compression using Content-Encoding is more widely supported than Transfer-Encoding, and some browsers do not advertise support for Transfer-Encoding compression in order to avoid triggering bugs in servers.
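As a minimal sketch of handling Content-Encoding on the client side (a hypothetical helper, not any particular library's API), a response body can be decoded according to this header:

```python
import gzip
import zlib

def decode_body(headers: dict, body: bytes) -> bytes:
    """Decode an HTTP response body according to its Content-Encoding."""
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # HTTP "deflate" is defined as a zlib-wrapped stream (RFC 1950).
        return zlib.decompress(body)
    if encoding == "identity":
        return body
    raise ValueError("unsupported Content-Encoding: " + encoding)

print(decode_body({"Content-Encoding": "gzip"}, gzip.compress(b"hello")))  # b'hello'
```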
Compression scheme negotiation
In most cases, excluding SDCH, the negotiation is done in two steps, described in RFC 2616:
1. The web client advertises which compression schemes it supports by including a token list in the HTTP request. For Content-Encoding, the list is in a field called Accept-Encoding; for Transfer-Encoding, the field is called TE.
2. If the server supports one or more compression schemes, the outgoing data may be compressed by one or more methods supported by both parties. If this is the case, the server will add a Content-Encoding or Transfer-Encoding field in the HTTP response with the schemes used, separated by commas.
The web server is by no means obligated to use any compression method: this depends on the internal settings of the web server and may also depend on the internal architecture of the website in question.
In the case of SDCH, a dictionary negotiation is also required, which may involve additional steps, such as downloading the appropriate dictionary from an external server.
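The two-step negotiation above can be sketched as a simple server-side selection function (a hypothetical helper that ignores q-values for brevity): parse the client's Accept-Encoding token list and pick the first scheme the server also supports.

```python
def choose_encoding(accept_encoding: str, server_supported=("gzip", "deflate")) -> str:
    """Pick a Content-Encoding supported by both parties, or fall back to identity."""
    # Step 1: the client advertised its schemes, e.g. "gzip, deflate, br".
    # (q-value parameters such as ";q=0.8" are stripped and ignored here.)
    client_schemes = [token.split(";")[0].strip().lower()
                      for token in accept_encoding.split(",") if token.strip()]
    # Step 2: the server picks a mutually supported scheme; it is also free
    # to pick none and send the data uncompressed.
    for scheme in client_schemes:
        if scheme in server_supported:
            return scheme
    return "identity"

print(choose_encoding("br, gzip, deflate"))  # -> gzip
```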
Content-Encoding token
The official token list available to servers and clients is maintained by IANA, and it includes:
- compress - UNIX "compress" program method (historic; deprecated in most applications and replaced by gzip or deflate)
- deflate - compression based on the deflate algorithm (described in RFC 1951), a combination of the LZ77 algorithm and Huffman coding, wrapped inside the zlib data format (RFC 1950);
- exi - W3C Efficient XML Interchange
- gzip - GNU zip format (described in RFC 1952). It uses the deflate algorithm for compression, but the data format and the checksum algorithm differ from the "deflate" content-encoding. This method was the most broadly supported as of March 2011.
- identity - No transformation is used. This is the default value for content encoding.
- pack200-gzip - Network Transfer Format for Java Archives
- br - Brotli, a new open-source compression algorithm specifically designed for HTTP content encoding, implemented in Mozilla Firefox release 44 and Chromium release 50.
In addition to this, some unofficial or non-standard tokens are used in the wild by servers or clients:
- bzip2 - compression based on the free bzip2 format, supported by lighttpd
- lzma - LZMA-based compression (raw) is available in Opera 20, and in elinks via a compile-time option
- peerdist - Microsoft Peer Content Caching and Retrieval
- sdch - Google Shared Dictionary Compression for HTTP, based on VCDIFF (RFC 3284)
- xpress - Microsoft compression protocol used by Windows 8 and later for Windows Store application updates. LZ77-based compression optionally using a Huffman encoding.
- xz - LZMA2-based content compression, supported by a non-official Firefox patch and fully implemented in mget since 2013-12-31.
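The practical difference between the two most common tokens, gzip and deflate, is easy to see with Python's standard library: both use the same deflate algorithm, but gzip (RFC 1952) wraps it in its own container with a CRC-32 checksum, while HTTP's "deflate" is a zlib stream (RFC 1950) with an Adler-32 checksum.

```python
import gzip
import zlib

data = b"Hello, HTTP compression!" * 10

gz = gzip.compress(data)   # RFC 1952 container with a CRC-32 trailer
zl = zlib.compress(data)   # RFC 1950 container with an Adler-32 trailer

print(gz[:2].hex())  # 1f8b  (gzip magic number)
print(zl[:1].hex())  # 78    (zlib header byte)
print(gzip.decompress(gz) == zlib.decompress(zl) == data)  # True
```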
Servers that support HTTP compression
- SAP NetWeaver
- Microsoft IIS: built-in or using third party modules
- Apache HTTP Server, via mod_deflate (despite its name, it only supports gzip)
- Hiawatha HTTP server: serves pre-compressed files
- Cherokee HTTP Server, with on-the-fly gzip and deflate compression
- Oracle iPlanet Web Server
- Zeus Web Server
- lighttpd, via mod_compress and later mod_deflate (1.4.42)
- nginx - built-in
- Applications based on Tornado, if "compress_response" is set to True in the application settings (for versions before 4.0, set "gzip" to True)
- Jetty Server - built into the serving of default static content and available via a servlet filter configuration
- GeoServer
- Apache Tomcat
- IBM Websphere
- AOLserver
- Ruby Rack, via the Rack::Deflater middleware
- HAProxy
- Varnish - built-in. Also works with ESI
Compression in HTTP can also be achieved by using the functionality of a server-side scripting language such as PHP, or a programming language such as Java.
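As a sketch of this scripting approach, a minimal WSGI application (hypothetical, not tied to any particular framework) can gzip its own response body and set the corresponding header when the client allows it:

```python
import gzip

def app(environ, start_response):
    """A minimal WSGI application that gzips its own response body."""
    body = b"<html><body>Hello, world!</body></html>"
    headers = [("Content-Type", "text/html")]
    # Compress only when the client advertised gzip support.
    if "gzip" in environ.get("HTTP_ACCEPT_ENCODING", ""):
        body = gzip.compress(body)
        headers.append(("Content-Encoding", "gzip"))
    headers.append(("Content-Length", str(len(body))))
    start_response("200 OK", headers)
    return [body]
```

Such an application can be served locally with `wsgiref.simple_server.make_server("", 8000, app)` from the standard library.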
Problems preventing the use of HTTP compression
A 2009 article by Google engineers Arvind Jain and Jason Glasgow stated that more than 99 person-years are wasted daily due to increased page load times when users do not receive compressed content. This occurs where anti-virus software interferes with connections to force them to be uncompressed, where proxies are used (with overly cautious web browsers), where servers are misconfigured, and where browser bugs prevent compression from being used. Internet Explorer 6, which drops to HTTP 1.0 (without features such as compression or pipelining) when behind a proxy, a common configuration in corporate environments, was the mainstream browser most prone to falling back to uncompressed HTTP.
Another problem found while deploying HTTP compression on a large scale is due to the definition of the deflate encoding: while HTTP 1.1 defines the deflate encoding as data compressed with deflate (RFC 1951) inside a zlib-formatted stream (RFC 1950), Microsoft server and client products historically implemented it as a raw deflate stream, making its deployment unreliable. For this reason, some software, including the Apache HTTP Server, only implements gzip encoding.
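This ambiguity can be illustrated with Python's zlib module, where the wbits parameter selects between the two interpretations; a tolerant client has to try both (a defensive sketch, mirroring what many browsers historically did):

```python
import zlib

data = b"payload " * 20

# RFC-compliant "deflate": deflate data wrapped in the zlib format (RFC 1950).
zlib_stream = zlib.compress(data)

# Historical Microsoft behavior: a bare raw deflate stream (RFC 1951 only).
co = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw_stream = co.compress(data) + co.flush()

def inflate(stream: bytes) -> bytes:
    """Try the zlib-wrapped interpretation first, then fall back to raw."""
    try:
        return zlib.decompress(stream)
    except zlib.error:
        return zlib.decompress(stream, wbits=-zlib.MAX_WBITS)

print(inflate(zlib_stream) == data and inflate(raw_stream) == data)  # True
```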
Security implications
In 2012, a general attack against the use of data compression, called CRIME, was announced. While the CRIME attack could work effectively against a large number of protocols, including but not limited to TLS, and application-layer protocols such as SPDY or HTTP, only exploits against TLS and SPDY were demonstrated, and these were largely mitigated in browsers and servers. The CRIME exploit against HTTP compression has not been mitigated at all, even though the authors of CRIME have warned that this vulnerability might be even more widespread than SPDY and TLS compression combined.
In 2013, a new instance of the CRIME attack against HTTP compression, dubbed BREACH, was published. A BREACH attack can extract login tokens, email addresses, or other sensitive information from TLS-encrypted web traffic in as little as 30 seconds (depending on the number of bytes to be extracted), provided the attacker tricks the victim into visiting a malicious web link. All versions of TLS and SSL are at risk from BREACH regardless of the encryption algorithm or cipher used. Unlike previous instances of CRIME, which can be successfully defended against by turning off TLS compression or SPDY header compression, BREACH exploits HTTP compression, which cannot realistically be turned off, as virtually all web servers rely upon it to improve data transmission speeds for users.
In 2016, the TIME attack and the HEIST attack became public knowledge.
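The core of the CRIME/BREACH length oracle can be illustrated in a few lines (a simplified sketch with invented page content, not a working exploit): when attacker-controlled input is compressed together with a secret, a guess that matches the secret yields a slightly shorter output, because deflate replaces the repetition with a short back-reference.

```python
import zlib

# A response that mixes a secret token with attacker-reflected input.
secret_page = b"<html>Cookie: session_token=abcdef123</html>"

def observed_length(reflected_guess: bytes) -> int:
    """Compressed size of the response; visible on the wire even under TLS."""
    return len(zlib.compress(secret_page + reflected_guess, 9))

right = observed_length(b"session_token=abcdef123")  # guess matches the secret
wrong = observed_length(b"session_token=xqzwvmrpk")  # guess does not match
print(right < wrong)  # True: the matching guess compresses better
```

By repeating such guesses character by character and watching the compressed length, an attacker can recover the secret without ever decrypting the traffic.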
References
External links
- RFC 2616: Hypertext Transfer Protocol - HTTP/1.1
- HTTP Content-Coding Values by the Internet Assigned Numbers Authority
- Compression with lighttpd
- Coding Horror: HTTP Compression on IIS 6.0
- 15 Seconds: Web Site Compression in the Wayback Machine (archived July 16, 2011)
- HTTP compression: resource page by founder of VIGOS AG, Constantin Rack
- Using HTTP Compression by Martin Brown from Server Watch
- Using HTTP Compression in PHP
- Dynamic and static HTTP compression with Apache httpd
Source of the article: Wikipedia