Caching Systems and Web Acceleration
In the early days of the web, when websites consisted of plain HTML files with mostly text and few images, page loading speed was not yet a key factor in user experience.
However, once the web became dynamic, i.e. once websites began serving dynamically generated content, page loading speed began to affect the user experience directly. In addition to serving HTML files, web servers now had to run scripting languages, which in turn work closely with databases. The size and complexity of web content keep growing, which adds delay in delivering it to users.
The server does a considerable amount of work to produce even a single page of a dynamic website: scripts process data, SQL queries run against the database, and various other processes execute, all to generate the information the user expects.
The word ‘cache’ means ‘to store, to hide’, and a web cache is a store of web information.
Although the information in web pages is now generated dynamically, it does not change every second. The same information is served to all users, yet it is generated anew for each of them, on every access. This wastes resources such as time, CPU and RAM.
Because this waste of time and resources directly affects website loading time, technologies were created to store the generated information. Once dynamically generated information is saved (cached), it behaves like static content: it is served to users directly, without being regenerated and without the associated delay.
Caching information for later use is an effective way to speed up the loading of a dynamic website. When the saved information is also compressed, the download time for the user drops even further.
Web caching and compression:
• decrease the waiting time;
• decrease network traffic, i.e. the amount of information transferred over the web;
• decrease the CPU usage and the web server load;
• provide faster access to web resources and faster loading.
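The effect of compression on a saved page can be illustrated with a short Python sketch that uses the standard gzip module. The page content below is made up for the example and is very repetitive, so the saving shown here is likely larger than it would be for a real page.

```python
import gzip

# Made-up page body standing in for cached, dynamically generated HTML.
page = (
    "<html><body>"
    + "<p>Cached paragraph of text.</p>" * 200
    + "</body></html>"
).encode("utf-8")

# Compress once when the page is stored; the smaller copy is what gets sent.
compressed = gzip.compress(page)

# The client (normally the browser) restores the original bytes.
restored = gzip.decompress(compressed)

print(f"original:   {len(page)} bytes")
print(f"compressed: {len(compressed)} bytes")
```

Compressing at the moment the page is cached, rather than on every request, means the CPU cost of compression is also paid only once.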
Caching Systems and Web Acceleration
Web acceleration systems (also called web accelerators) use caching as part of their overall function. Apart from caching, such systems may offer compression, filtering, prefetching of resources and other features.
Web content can be cached in three places: locally at the user’s end (Browser Cache), in front of the source server of the web content (Reverse Proxy/Web Accelerator), or on the source server itself, at the application level.
Cache miss (the information you are looking for is not available in the cache)
(1) The web browser sends a request for the page.html web resource to the server for the respective domain. The request is intercepted by the proxy server.
(2) The proxy server checks its cache for the page and does not find it. This is called a cache miss.
(3) The proxy server requests the resource from the source server.
(4) The source server returns the requested web page to the proxy.
(5) The proxy server receives the web resource, stores a copy in its cache and sends it on to the web client.
(6) The web browser receives the requested web resource.
Cache hit (the information you are looking for is available in the cache)
(A) The web browser sends a request for the page.html web resource to the server for the requested domain, and the request is intercepted by the proxy server (since the proxy server is located in front of the source server).
(B) The proxy server checks its cache for the page and finds it. This is a cache hit. The proxy server supplies the web resource directly to the client, this time without contacting the source server.
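The two flows above can be sketched in a few lines of Python. This is only an illustration: SourceServer and CachingProxy are hypothetical classes invented for the example, there is no real networking, and the cache never expires its entries.

```python
class SourceServer:
    """Hypothetical source server that counts how often it is contacted."""

    def __init__(self):
        self.requests = 0

    def fetch(self, path):
        self.requests += 1
        return f"content of {path}"


class CachingProxy:
    """Hypothetical proxy placed in front of the source server."""

    def __init__(self, source):
        self.source = source
        self.cache = {}  # path -> stored copy of the resource

    def handle(self, path):
        if path in self.cache:
            # Cache hit: answer directly, the source server is not contacted.
            return self.cache[path], "HIT"
        # Cache miss: ask the source server, keep a copy, then answer.
        body = self.source.fetch(path)
        self.cache[path] = body
        return body, "MISS"


source = SourceServer()
proxy = CachingProxy(source)

miss_result = proxy.handle("page.html")  # steps (1) to (6)
hit_result = proxy.handle("page.html")   # steps (A) and (B)
```

After both calls the source server has been contacted only once, which is where the saving in time and server load comes from.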
What is Browser Cache?
The user’s web browser may cache objects from websites, such as images, scripts and pages. When a website is accessed, the web browser first checks whether the objects on the website are already available in its cache. If the objects are saved and still up to date, the browser loads them from the cache instead of connecting to the web server. Loading data from the web browser’s cache is much faster than downloading it from the web server. After changes are made to a website, it may sometimes be necessary to clear the cached data in the web browser in order to see the latest version of the site.
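One way a browser can decide whether a saved object is "still up to date" is the Cache-Control response header defined by HTTP. The Python sketch below is a simplified illustration, not how any particular browser is implemented: is_fresh is a hypothetical helper, it only looks at the max-age, no-cache and no-store directives, and it ignores the other freshness rules that real browsers apply.

```python
import re


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """Return True if a stored copy may be reused without contacting the server.

    cache_control is the value of the Cache-Control header that arrived with
    the stored response; age_seconds is how long ago the copy was stored.
    """
    directives = [d.strip().lower() for d in cache_control.split(",")]
    # no-store: must not be kept at all; no-cache: must be revalidated first.
    if "no-store" in directives or "no-cache" in directives:
        return False
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return age_seconds < int(match.group(1))
    # No explicit lifetime given: this sketch treats the copy as stale.
    return False


print(is_fresh("public, max-age=3600", age_seconds=120))  # stored 2 minutes ago
print(is_fresh("max-age=60", age_seconds=120))            # lifetime exceeded
```

This also shows why a change to a website may not be visible immediately: until the lifetime of the stored copy runs out, the browser has no reason to ask the server again.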
What is Reverse Proxy?
A Reverse Proxy is a type of proxy server (a mediator between a web client and a web server), and one of its functions is caching. A Reverse Proxy is placed in front of the web server, so users connect to the proxy server instead of directly to the web server.
Some of the useful functions the Reverse Proxy may perform are the following:
• decreasing the web server load through effective caching of the static and dynamic content (web accelerator);
• decreasing the time for accessing the website;
• load distribution, if there is more than one web server behind the proxy;
• protection against common web-based attacks;
• content compression for quicker web page loading.
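As an illustration of the load distribution function, the Python sketch below hands out requests to several web servers in turn (round robin). This is only one possible strategy, chosen here because it is the simplest to show. The addresses are made up, and pick_backend is a hypothetical helper.

```python
from itertools import cycle

# Made-up addresses of the web servers behind the proxy.
backends = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

# cycle() repeats the list endlessly: first, second, third, first, ...
_next_backend = cycle(backends)


def pick_backend() -> str:
    """Return the web server that should handle the next request."""
    return next(_next_backend)


# Six incoming requests are spread evenly over the three servers.
assigned = [pick_backend() for _ in range(6)]
print(assigned)
```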
When working with virtual servers you can use any caching system and acceleration technique that is compatible with the platform, settings and the objectives of the web project.
When working with Cloud virtual servers, just as with ordinary virtual servers, you can use any type of system and technique to optimize and accelerate your web projects. Beyond its key advantage, the Cloud platform also has an organizational one: within a single customer account you can create, reinstall and remove as many virtual servers as you wish. Apart from the VPS where your site is located, you can create additional VPSs dedicated to acceleration and caching. You can create Load Balancers, and you can install web accelerators and proxy caching servers. These optimization tools may be placed in front of the main virtual server in order to balance the load and to accelerate access to the web project and its loading.
Caching technologies at app level
There are various systems and technologies for caching data internally, on the application server. Memcache, APC and eAccelerator are examples of such caching technologies.
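The usual pattern with such a technology is to look in the cache first and to query the database only when the data is not there. The Python sketch below illustrates the pattern under stated assumptions: InMemoryCache is a stand-in for a Memcache-style key-value store reduced to get and set, and load_user_from_db is a hypothetical function that represents an SQL query. Real client libraries have their own interfaces and also support expiry times.

```python
class InMemoryCache:
    """Stand-in for a Memcache-style key-value store (get and set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None when the key is not stored

    def set(self, key, value):
        self._data[key] = value


query_count = 0  # how many times the "database" was actually queried


def load_user_from_db(user_id):
    """Hypothetical placeholder for an SQL query."""
    global query_count
    query_count += 1
    return {"id": user_id, "name": f"user{user_id}"}


cache = InMemoryCache()


def get_user(user_id):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        # Not cached yet: run the query once and keep the result.
        user = load_user_from_db(user_id)
        cache.set(key, user)
    return user


first_lookup = get_user(7)   # queries the database
second_lookup = get_user(7)  # answered from the cache
```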
Data caching is a reliable method for optimizing the speed of web apps, but it is not the only one. This method alone is not enough when other optimizations need to be applied at a lower level.
Reverse Proxy caching typically works only for website traffic generated by anonymous users. When the site is accessed by a user who is logged into the system, the pages are usually personalized, so as a rule they are not served from the cache.
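A proxy commonly tells anonymous and logged-in users apart by looking for a session cookie in the request. The Python sketch below illustrates such a check. The cookie name sessionid and the helper may_use_shared_cache are assumptions made for the example. Real applications use their own cookie names, and real proxies are configured with their own rules.

```python
from http.cookies import SimpleCookie

SESSION_COOKIE = "sessionid"  # assumed name of the login cookie


def may_use_shared_cache(method: str, cookie_header: str) -> bool:
    """Decide whether a request may be answered from the proxy cache."""
    # Only read-only requests are candidates for caching in this sketch.
    if method.upper() not in ("GET", "HEAD"):
        return False
    cookies = SimpleCookie()
    cookies.load(cookie_header)  # parses e.g. "theme=dark; sessionid=abc"
    # A session cookie means a logged-in user: go to the source server.
    return SESSION_COOKIE not in cookies


print(may_use_shared_cache("GET", "theme=dark"))
print(may_use_shared_cache("GET", "theme=dark; sessionid=abc123"))
```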
Fun fact: Apart from the technologies and systems mentioned above, whose main goal is to accelerate web apps and access to them, there are other kinds of caching.
For example, the Google search engine also caches web data. For every site visited by the Google bot, a cached version is created. That caching is especially useful when a website is temporarily unavailable, because the site can still be loaded from the cache.
There are also dedicated systems, such as The Internet Archive, that create and maintain an archive of websites.
More caching: DNS cache servers. The DNS information of a domain is cached on the DNS servers of Internet providers. On the next access to this domain, the cached DNS information is used instead of resolving the domain again, starting from the DNS root servers.