Caching

1. Types of Caches (Client-Side, Server-Side, Distributed)

Caching is a technique used to store copies of frequently accessed data in a location closer to the data consumer, reducing latency and improving performance.

1.1 Client-Side Cache

Client-side caching stores data on the user’s device, often in the browser or application cache.

  • Usage: Reduces server requests for static assets like images, stylesheets, and JavaScript files.
  • Example: Web browsers cache files using HTTP headers (Cache-Control, Expires).

Example Configuration (HTTP Header):

Cache-Control: max-age=3600, must-revalidate
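To show where such a header comes from, here is a minimal server-side sketch using Flask (Flask and the route path are assumptions; any framework that lets you set response headers works the same way):

from flask import Flask, make_response

app = Flask(__name__)

@app.route('/assets/data')  # Hypothetical route for illustration
def cached_asset():
    resp = make_response('static payload')
    # Ask the browser to cache this response for one hour,
    # then revalidate with the server before reusing it
    resp.headers['Cache-Control'] = 'max-age=3600, must-revalidate'
    return resp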

1.2 Server-Side Cache

Server-side caching stores data on the server, typically in memory, so that many users requesting the same data can be served quickly.

  • Usage: Improves response times for database queries or frequently accessed data.
  • Example: Memcached and Redis are popular server-side caching solutions.

Example Code (Using Redis for Server-Side Caching in Python):

import redis

# Connect to a local Redis instance
cache = redis.StrictRedis(host='localhost', port=6379, db=0)

# Set a value in the cache
cache.set('user:1000', '{"name": "Alice", "age": 30}')

# Get the cached value (redis-py returns bytes)
print(cache.get('user:1000'))  # b'{"name": "Alice", "age": 30}'

1.3 Distributed Cache

A distributed cache stores data across multiple servers, enabling caching at a larger scale with high availability.

  • Usage: Ideal for distributed systems with high read loads or load balancing requirements.
  • Example: Amazon ElastiCache, Redis Cluster.
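As a minimal sketch of talking to a distributed cache, the snippet below uses redis-py's cluster client (available in redis-py 4.1+); the host and port are assumptions for a locally running Redis Cluster.

Example Code (Connecting to a Redis Cluster in Python):

from redis.cluster import RedisCluster

# Assumed entry point for a locally running Redis Cluster
cache = RedisCluster(host='localhost', port=7000)

# The client hashes each key to a slot and routes the command
# to whichever cluster node owns that slot
cache.set('user:1000', '{"name": "Alice", "age": 30}')
print(cache.get('user:1000'))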

2. Cache Replacement Policies (LRU, LFU)

Cache replacement policies define how caches evict data when the cache storage limit is reached.

2.1 Least Recently Used (LRU)

LRU evicts the least recently accessed item when cache space is needed. It works well for data with temporal locality, where recently accessed items are more likely to be accessed again.

LRU Example in Python:

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.cache = OrderedDict()  # Remembers insertion/access order
        self.capacity = capacity

    def get(self, key):
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # Mark key as most recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # Evict the least recently used item

cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)

print(cache.get(1))   # Output: 1

cache.put(3, 3)       # Removes the least recently used item (2)

print(cache.get(2))   # Output: -1

2.2 Least Frequently Used (LFU)

LFU evicts the item with the fewest accesses when cache space is needed. It works well when access frequency is a better predictor of future use than recency, for example when a stable set of popular items receives most of the traffic.
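Here is a simplified LFU sketch mirroring the LRU example above: it tracks an access count per key and, on eviction, scans for the minimum. Production implementations typically use frequency buckets to make eviction O(1); the linear scan here keeps the sketch short.

LFU Example in Python:

from collections import defaultdict

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}               # key -> value
        self.freq = defaultdict(int)  # key -> access count
        self.order = {}               # key -> last-access tick, breaks frequency ties
        self.tick = 0

    def _touch(self, key):
        self.tick += 1
        self.freq[key] += 1
        self.order[key] = self.tick

    def get(self, key):
        if key not in self.cache:
            return -1
        self._touch(key)
        return self.cache[key]

    def put(self, key, value):
        if key not in self.cache and len(self.cache) >= self.capacity:
            # Evict the least frequently used key; break ties by least recent access
            victim = min(self.cache, key=lambda k: (self.freq[k], self.order[k]))
            del self.cache[victim], self.freq[victim], self.order[victim]
        self.cache[key] = value
        self._touch(key)

cache = LFUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')          # 'a' now has more accesses than 'b'
cache.put('c', 3)       # Evicts 'b', the least frequently used key
print(cache.get('b'))   # Output: -1
print(cache.get('a'))   # Output: 1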

3. Cache Invalidation Strategies

Cache invalidation is the process of removing or updating cache entries when the underlying data changes.

3.1 Common Strategies

  • Time-based Expiration (TTL): Each cache entry is valid for a specified time-to-live (TTL). After this period, the entry is invalidated. For HTTP responses, this is expressed as Cache-Control: max-age=300. A Redis-based sketch follows this list.
  • Write-through: Updates the cache and the underlying data store simultaneously. This ensures that the cache always has the latest data, but may add write latency.
  • Write-behind (write-back): Data is first updated in the cache and then asynchronously written to the database. This is useful for reducing database load, but may cause data loss on cache failure.
  • Manual Invalidation: Application logic explicitly clears or updates the cache when data changes (e.g., on an update or delete).
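The sketch below combines two of these strategies: a write-through update paired with a TTL as a safety net. It uses the redis-py client from earlier; `database.update_user` is a hypothetical data-access helper standing in for your persistence layer.

import json
import redis

cache = redis.StrictRedis(host='localhost', port=6379, db=0)

def save_user(user_id, data):
    # Write-through: update the database and the cache together
    database.update_user(user_id, data)  # Hypothetical data-access helper
    # Time-based expiration: ex=300 gives the entry a 300-second TTL,
    # so it expires even if no explicit invalidation ever runs
    cache.set(f'user:{user_id}', json.dumps(data), ex=300)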

Example of Manual Cache Invalidation:

# Assume `database` is the application's data-access layer (hypothetical)
# and `cache` is the Redis client created earlier
def update_user_profile(user_id, new_data):
    database.update_user(user_id, new_data)  # Write the change to the source of truth
    cache.delete(f'user:{user_id}')  # Invalidate the stale cache entry for this user
