Design a TTL based in-memory cache in golang

Introduction

Caches are data storage layer that are used to avoid doing operation that is expensive or hard to compute.

The following examples can be characterised as an expensive operation:

  • Data which is to be fetched from database, resulting in network calls and usage of precious database resources.
  • Data which is calculated using the response from multiple API calls to different services.
  • Cryptographic operations which take a lot of computing resources.

Essentially, caches are like a water jug kept on the dining table. They will be replenished from tap water when they are empty. Having a jug reduces the effort of multiple people going to get water.

Similarly, as a developer, we choose to store the data of expensive operation in cache and evict the data from cache depending on the specified eviction policy. The eviction policy could be of multiple kinds – LRU, LFU and time-based eviction.

We are going to talk about time-based eviction now. Simply put, we store the data for a specified duration and after this time is elapsed, we expire the data. We will take inspiration from the wonderful yet simple library – patrickmn/go-cache: An in-memory key:value store/cache (similar to Memcached) library for Go, suitable for single-machine applications. (github.com). We will try to explain the code written here and try to make a simplified version of it.

Before we dive down to see the code, its good to see the cache in action. So, go ahead and use it in your code or just see the code in usage here – patrickmn/go-cache: An in-memory key:value store/cache (similar to Memcached) library for Go, suitable for single-machine applications. (github.com)

Architecture

Caches are usually a combination of two components:

  • Storage layer to store the key-value pair.
  • Eviction layer to remove the data after the eviction condition is met. In our case, we are going to use time-based eviction.

Now, we will see how the library has implemented the two components. We will start with the data structures.

Item

type Item struct {
	Object     interface{}
	Expiration int64
}

Item is used to store the value part of key-value pair. Item struct has two fields. Object which is an interface. Using an interface allows us to store any kind of value. Expiration is a field to store the Unix time after which the key-value pair should not be visible.

Janitor

type janitor struct {
	Interval time.Duration
	stop     chan bool
}

Think of Janitor like a janitor in real life. Janitor struct has two components. Interval is time duration after which janitor periodically comes to clean up the storage. It purges the data which is expired. Stop is a channel which is used to inform janitor that cleaning is no longer needed.

cache

type cache struct {
	defaultExpiration time.Duration
	items             map[string]Item
	mu                sync.RWMutex
	onEvicted         func(string, interface{})
	janitor           *janitor
}

Cache struct lies at the heart of this library. It stores the following fields:

  • defaultExpiration – Time after which the key-value (KV in short) pair will be expired if the key-value pair is not set with its own expiry time.
  • items is a map with string as key and Item as the value.
  • mu is a lock. Since map is not thread-safe, we cannot guarantee the behaviour of a map in case of concurrent write operation. Lock helps in ensuring that map is not accessed by two different threads for write at the same time. sync.RWMutex has two different kinds of lock – Lock() and RLock(). Lock() allows only one goroutine to read and write at a time. RLock() allows multiple goroutines to read but not write at the same time. Read more about it here – go – what is the difference between RLock() and Lock() in Golang? – Stack Overflow
  • onEvicted is a function that is passed by the user, which acts like a call back. This function is called whenever there is eviction of data. One use of the method could be to call a method that replenish the cache from database to prevent the data from getting stale.
  • janitor – janitor, as described above, is a cleaner to periodically purge data from the cache after a specified duration.

Cache

type Cache struct {
	*cache
}

A field declared with a type but no explicit field name is an anonymous field, also called an embedded field or an embedding of the type in the struct. An embedded type must be specified as a type name T or as a pointer to a non-interface type name *T, and T itself may not be a pointer type. The unqualified type name acts as the field name.

But important question is – Why does this struct just have a cache pointer? We will try to find the answer later.

Now that we have seen the important structs, let’s have a look at the important methods.

New(…)

func New(defaultExpiration, cleanupInterval time.Duration) *Cache {
	items := make(map[string]Item)
	return newCacheWithJanitor(defaultExpiration, cleanupInterval, items)
}

This is the method used by developers to instantiate a new cache. It has two parameters:

  • defaultExpiration – default ttl of the KV pair, if it is not set with its own ttl.
  • cleanupInterval – time interval after which janitor purges the expired data.

The method returns a pointer to the cache. This method calls newCacheWithJanitor method.

func newCacheWithJanitor(de time.Duration, ci time.Duration, m map[string]Item) *Cache {
	c := newCache(de, m)
	C := &Cache{c}
	if ci > 0 {
		runJanitor(c, ci)
		runtime.SetFinalizer(C, stopJanitor)
	}
	return C
}

This method initialises c, a struct of type cache, and then initializes Cache. The method runJanitor runs a goroutine on c. Ticker is initialised with the purge duration and we have select waiting on a channel for the ticks to be delivered after purge duration. Once the ticks are delivered, DeletedExpired method is called. Tickers are used when you want to do something repeatedly at regular intervals. Tickers are built-in library in Golang.

func (j *janitor) Run(c *cache) {
	ticker := time.NewTicker(j.Interval)
	for {
		select {
		case <-ticker.C:
			c.DeleteExpired()
		case <-j.stop:
			ticker.Stop()
			return
		}
	}
}

func runJanitor(c *cache, ci time.Duration) {
	j := &janitor{
		Interval: ci,
		stop:     make(chan bool),
	}
	c.janitor = j
	go j.Run(c)
}

Since the janitor is working in a goroutine on c, an object of the cache struct, it will never be available for garbage collection. Hence, Cache struct is designed to have cache as a field. If Cache struct is garbage collected, stopJanitor is called using the runtime.setFinalizer. runtime.setFinalizer is used to call a function, here, stopJanitor as the first operand, and here C is garbage collected.

Add(…)

func (c *cache) Add(k string, x interface{}, d time.Duration) error {
	c.mu.Lock()
	_, found := c.get(k)
	if found {
		c.mu.Unlock()
		return fmt.Errorf("Item %s already exists", k)
	}
	c.set(k, x, d)
	c.mu.Unlock()
	return nil
}

func (c *cache) set(k string, x interface{}, d time.Duration) {
	var e int64
	if d == DefaultExpiration {
		d = c.defaultExpiration
	}
	if d > 0 {
		e = time.Now().Add(d).UnixNano()
	}
	c.items[k] = Item{
		Object:     x,
		Expiration: e,
	}
}

Add method adds the key-value pair if it is not already stored or it is expired. c.mu.Lock() allows only one goroutine to access the code block that it locks. This is important because items map in cache is not thread-safe. We are storing the expiry time in Unix time in nanoseconds.

Get(…)

func (c *cache) Get(k string) (interface{}, bool) {
	c.mu.RLock()
	item, found := c.items[k]
	if !found {
		c.mu.RUnlock()
		return nil, false
	}
	if item.Expiration > 0 {
		if time.Now().UnixNano() > item.Expiration {
			c.mu.RUnlock()
			return nil, false
		}
	}
	c.mu.RUnlock()
	return item.Object, true
}

Get method is used by the user to get the value, given the key, if it is available in cache. Please notice that, Get metod uses RLock() as it allows multiple goroutines to access the code block being locked for read but not for write.

Key-Value pair is not removed immediately after the expiry time is over. DeleteExpired() method runs periodically after the purge interval, and it deletes the expired key-value pair. Until then, get method checks for the expiry by comparing current time with expiry time.

func (c *cache) DeleteExpired() {
	var evictedItems []keyAndValue
	now := time.Now().UnixNano()
	c.mu.Lock()
	for k, v := range c.items {
		// "Inlining" of expired
		if v.Expiration > 0 && now > v.Expiration {
			ov, evicted := c.delete(k)
			if evicted {
				evictedItems = append(evictedItems, keyAndValue{k, ov})
			}
		}
	}
	c.mu.Unlock()
	for _, v := range evictedItems {
		c.onEvicted(v.key, v.value)
	}
}

Like the post? Please subscribe to the blog to get regular updates.

Implement your own cache with time based eviction

package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

const (
	INFINITY = -1
	DEFAULT  = 0
)

func main() {
	fmt.Println("Hello World")
	cache := New(10*time.Hour, 20*time.Minute)
	fmt.Println(cache.defaultExpiryDuration)
	fmt.Println(cache.kvstore)
	cache.Set("foo", "bar", 2*time.Minute)
	fmt.Println(cache.kvstore)
	value, found := cache.Get("foo")
	if found {
		fmt.Println("Value is ", value)
	}
}

type Data struct {
	Value    interface{}
	ExpireAt int64
}

type Cleaner struct {
	Interval time.Duration
	stop     chan bool
}

type cache struct {
	defaultExpiryDuration time.Duration
	kvstore               map[string]Data
	locker                sync.RWMutex
	cleaner               *Cleaner
	onRemoval             func(string, interface{})
}

type Cache struct {
	*cache
}

func New(defaultExpiryDuration time.Duration, cleanUpInterval time.Duration) *Cache {
	if defaultExpiryDuration == 0 {
		defaultExpiryDuration = INFINITY
	}

	cache := &cache{
		defaultExpiryDuration: defaultExpiryDuration,
		kvstore:               make(map[string]Data),
	}

	Cache := &Cache{cache}

	if cleanUpInterval > 0 {
		clean(cleanUpInterval, cache)
		runtime.SetFinalizer(Cache, stopCleaning)
	}
	return Cache
}

func clean(cleanUpInterval time.Duration, cache *cache) {
	cleaner := &Cleaner{
		Interval: cleanUpInterval,
		stop:     make(chan bool),
	}

	cache.cleaner = cleaner
	go cleaner.Cleaning(cache)

}

func (c *Cleaner) Cleaning(cache *cache) {
	ticker := time.NewTicker(c.Interval)

	for {
		select {
		case <-ticker.C:
			cache.purge()
		case <-c.stop:
			ticker.Stop()

		}
	}
}

func stopCleaning(cache *Cache) {
	cache.cleaner.stop <- true
}

func (cache *cache) purge() {
	now := time.Now().UnixNano()
	for key, data := range cache.kvstore {
		if data.ExpireAt < now {
			delete(cache.kvstore, key)
		}
	}
}

func (c *cache) Set(key string, value interface{}, expiryDuration time.Duration) {
	if expiryDuration == DEFAULT {
		expiryDuration = c.defaultExpiryDuration
	}
	var expireAt int64

	if expiryDuration > 0 {
		expireAt = time.Now().Add(expiryDuration).UnixNano()
	}
	c.locker.Lock()
	defer c.locker.Unlock()
	c.kvstore[key] = Data{
		Value:    value,
		ExpireAt: expireAt,
	}
}
func (c *cache) Get(key string) (interface{}, bool) {
	c.locker.RLock()
	defer c.locker.RUnlock()
	data, found := c.kvstore[key]
	if !found {
		return nil, false
	}

	if data.ExpireAt < time.Now().UnixNano() {
		return nil, false
	}

	return data.Value, true
}

Here’s a simplified version of TTL based cache. It would be recommended to design and implement it yourself first.

If you liked this article and would like one such blog to land in your inbox every week, consider subscribing to our newsletter: https://skillcaptain.substack.com

Leave a Reply

Blog at WordPress.com.

Up ↑