Cache-Aside pattern

Load data on demand into a cache from a data store. This can improve performance and also helps to maintain consistency between data held in the cache and data in the underlying data store.

Context and problem

Applications use a cache to improve repeated access to information held in a data store. However, it's impractical to expect that cached data will always be completely consistent with the data in the data store. Applications should implement a strategy that helps to ensure that the data in the cache is as up-to-date as possible, but can also detect and handle situations that arise when the data in the cache has become stale.

Solution

Many commercial caching systems provide read-through and write-through/write-behind operations. In these systems, an application retrieves data by referencing the cache. If the data isn't in the cache, it's retrieved from the data store and added to the cache. Any modifications to data held in the cache are automatically written back to the data store as well.

For caches that don't provide this functionality, it's the responsibility of the applications that use the cache to maintain the data.

An application can emulate the functionality of read-through caching by implementing the cache-aside strategy. This strategy loads data into the cache on demand. The figure illustrates using the Cache-Aside pattern to store data in the cache.

If an application updates information, it can follow the write-through strategy by making the modification to the data store, and by invalidating the corresponding item in the cache.

When the item is next required, using the cache-aside strategy will cause the updated data to be retrieved from the data store and added back into the cache.
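The read and update paths described above can be sketched with plain in-memory dictionaries. This is a minimal illustration under stated assumptions, not a production implementation: the class name CacheAsideSketch, its members, and the use of Dictionary as a stand-in for both the cache and the data store are hypothetical and chosen only to make the flow visible.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins: both the "cache" and the "data store" are dictionaries.
public class CacheAsideSketch
{
    private readonly Dictionary<int, string> cache = new Dictionary<int, string>();
    private readonly Dictionary<int, string> store = new Dictionary<int, string>();

    // Counts how often the data store had to be read (cache misses).
    public int StoreReads { get; private set; }

    public CacheAsideSketch(IDictionary<int, string> seed)
    {
        foreach (var pair in seed) store[pair.Key] = pair.Value;
    }

    // Read path: try the cache first, fall back to the store, then cache the result.
    public bool TryGet(int id, out string value)
    {
        if (cache.TryGetValue(id, out value)) return true;   // cache hit

        StoreReads++;
        if (!store.TryGetValue(id, out value)) return false; // not in the store either

        cache[id] = value;                                   // load on demand
        return true;
    }

    // Update path: modify the store, then invalidate the cached copy.
    public void Update(int id, string value)
    {
        store[id] = value;
        cache.Remove(id);
    }
}
```

Calling TryGet twice for the same id reads the store only once. After Update, the next TryGet misses the cache and reloads the new value from the store.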

Issues and considerations

Consider the following points when deciding how to implement this pattern:

Lifetime of cached data. Many caches implement an expiration policy that invalidates data and removes it from the cache if it's not accessed for a specified period. For cache-aside to be effective, ensure that the expiration policy matches the pattern of access for applications that use the data. Don't make the expiration period too short because this can cause applications to continually retrieve data from the data store and add it to the cache. Similarly, don't make the expiration period so long that the cached data is likely to become stale. Remember that caching is most effective for relatively static data, or data that is read frequently.
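To make the expiration behavior concrete, here is a minimal sketch of a cache that drops an item once it has not been accessed for a configured idle period. The class SlidingExpirationCache and the injected clock delegate are assumptions made for illustration (the clock makes the behavior easy to observe); real cache products expose expiration through their own configuration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache that expires items not accessed within idleLimit.
public class SlidingExpirationCache<TKey, TValue>
{
    private readonly Dictionary<TKey, (TValue Value, DateTime LastAccess)> items =
        new Dictionary<TKey, (TValue Value, DateTime LastAccess)>();
    private readonly TimeSpan idleLimit;
    private readonly Func<DateTime> clock; // injected so callers control "now"

    public SlidingExpirationCache(TimeSpan idleLimit, Func<DateTime> clock)
    {
        this.idleLimit = idleLimit;
        this.clock = clock;
    }

    public void Set(TKey key, TValue value)
    {
        items[key] = (value, clock());
    }

    public bool TryGet(TKey key, out TValue value)
    {
        if (items.TryGetValue(key, out var entry))
        {
            var now = clock();
            if (now - entry.LastAccess <= idleLimit)
            {
                items[key] = (entry.Value, now); // each access restarts the idle timer
                value = entry.Value;
                return true;
            }
            items.Remove(key); // idle for too long: treat as expired
        }
        value = default(TValue);
        return false;
    }
}
```

With a very short idle limit almost every read becomes a miss and goes back to the data store. With a very long one, an item that changes in the store can be served stale for the whole period.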

Evicting data. Most caches have a limited size compared to the data store where the data originates, and they'll evict data if necessary. Most caches adopt a least-recently-used policy for selecting items to evict, but this might be customizable. Configure the global expiration property and other properties of the cache, and the expiration property of each cached item, to ensure that the cache is cost effective. It isn't always appropriate to apply a global eviction policy to every item in the cache. For example, if a cached item is very expensive to retrieve from the data store, it can be beneficial to keep this item in the cache at the expense of more frequently accessed but less costly items.
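The following sketch shows one way the eviction idea above could look in code: a least-recently-used cache in which individual items can be exempted from eviction because they are expensive to reload. The LruCache class and its pinned flag are hypothetical; actual cache products offer their own eviction settings, and this is only an illustration of the trade-off.

```csharp
using System.Collections.Generic;

// Hypothetical LRU cache; "pinned" items are skipped when choosing what to evict.
public class LruCache<TKey, TValue>
{
    private readonly int capacity;
    // Front of the list = most recently used, back = least recently used.
    private readonly LinkedList<(TKey Key, TValue Value, bool Pinned)> order =
        new LinkedList<(TKey Key, TValue Value, bool Pinned)>();
    private readonly Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value, bool Pinned)>> index =
        new Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value, bool Pinned)>>();

    public LruCache(int capacity) { this.capacity = capacity; }

    public int Count => index.Count;

    public bool TryGet(TKey key, out TValue value)
    {
        if (index.TryGetValue(key, out var node))
        {
            order.Remove(node);   // move to the front: it was just used
            order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    public void Set(TKey key, TValue value, bool pinned = false)
    {
        if (index.TryGetValue(key, out var existing))
        {
            order.Remove(existing);
            index.Remove(key);
        }
        index[key] = order.AddFirst((key, value, pinned));
        if (index.Count > capacity) EvictOne();
    }

    private void EvictOne()
    {
        // Walk from the least recently used end and evict the first unpinned item.
        for (var node = order.Last; node != null; node = node.Previous)
        {
            if (!node.Value.Pinned)
            {
                order.Remove(node);
                index.Remove(node.Value.Key);
                return;
            }
        }
        // Every item is pinned: nothing is evicted and the cache exceeds its capacity.
    }
}
```

In this sketch a pinned item survives even when it is the least recently used entry, at the cost of evicting a more recently used but cheaper item.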

Priming the cache. Many solutions prepopulate the cache with the data that an application is likely to need as part of the startup processing. The Cache-Aside pattern can still be useful if some of this data expires or is evicted.
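As a rough sketch of how priming and cache-aside can coexist, the class below fills the cache at startup and still falls back to loading on demand when a primed item has since been evicted or expired. PrimedCache, the loadFromStore delegate, and the explicit Evict method are assumptions introduced for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache that is primed at startup and uses cache-aside for misses.
public class PrimedCache
{
    private readonly Dictionary<int, string> cache = new Dictionary<int, string>();
    private readonly Func<int, string> loadFromStore; // stand-in for a data store read

    public PrimedCache(Func<int, string> loadFromStore)
    {
        this.loadFromStore = loadFromStore;
    }

    // Startup step: load the items the application is likely to need.
    public void Prime(IEnumerable<int> ids)
    {
        foreach (var id in ids) cache[id] = loadFromStore(id);
    }

    // Simulates the cache evicting or expiring an item.
    public void Evict(int id)
    {
        cache.Remove(id);
    }

    // Cache-aside read: a primed item is a hit, anything else is loaded on demand.
    public string Get(int id)
    {
        if (!cache.TryGetValue(id, out var value))
        {
            value = loadFromStore(id);
            cache[id] = value;
        }
        return value;
    }
}
```

Reads of primed items never touch the store. A read after Evict, or of an id that was never primed, triggers exactly one store load and repopulates the cache.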

Consistency. Implementing the Cache-Aside pattern doesn't guarantee consistency between the data store and the cache. An item in the data store can be changed at any time by an external process, and this change might not be reflected in the cache until the next time the item is loaded. In a system that replicates data across data stores, this problem can become serious if synchronization occurs frequently.

Local (in-memory) caching. A cache could be local to an application instance and stored in-memory. Cache-aside can be useful in this environment if an application repeatedly accesses the same data. However, a local cache is private and so different application instances could each have a copy of the same cached data. This data could quickly become inconsistent between caches, so it might be necessary to expire data held in a private cache and refresh it more frequently. In these scenarios, consider investigating the use of a shared or a distributed caching mechanism.

When to use this pattern

Use this pattern when:

  • A cache doesn't provide native read-through and write-through operations.
  • Resource demand is unpredictable. This pattern enables applications to load data on demand. It makes no assumptions about which data an application will require in advance.

This pattern might not be suitable:

  • When the cached data set is static. If the data will fit into the available cache space, prime the cache with the data on startup and apply a policy that prevents the data from expiring.
  • For caching session state information in a web application hosted in a web farm. In this environment, you should avoid introducing dependencies based on client-server affinity.

Example

In Microsoft Azure you can use Azure Redis Cache to create a distributed cache that can be shared by multiple instances of an application.
To connect to an Azure Redis Cache instance, call the static Connect method and pass in the connection string. The method returns a ConnectionMultiplexer that represents the connection. One approach to sharing a ConnectionMultiplexer instance in your application is to have a static property that returns a connected instance, similar to the following example. This approach provides a thread-safe way to initialize only a single connected instance.

// Requires: using System; using System.Configuration; using StackExchange.Redis;

// Redis connection string info. Lazy<T> ensures that only one connected
// instance is created, even if several threads request it at the same time.
private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
{
    string cacheConnection = ConfigurationManager.AppSettings["CacheConnection"];
    return ConnectionMultiplexer.Connect(cacheConnection);
});

public static ConnectionMultiplexer Connection => lazyConnection.Value;

The GetMyEntityAsync method in the following code example shows an implementation of the Cache-Aside pattern based on Azure Redis Cache. This method retrieves an object from the cache using the read-through approach.

An object is identified by using an integer ID as the key. The GetMyEntityAsync method tries to retrieve an item with this key from the cache. If a matching item is found, it's returned. If there's no match in the cache, the GetMyEntityAsync method retrieves the object from a data store, adds it to the cache, and then returns it. The code that actually reads the data from the data store is not shown here, because it depends on the data store. Note that the cached item is configured to expire to prevent it from becoming stale if it's updated elsewhere.

// Requires: using System; using System.Threading.Tasks; using Newtonsoft.Json;

// Set five minute expiration as a default
private const double DefaultExpirationTimeInMinutes = 5.0;

public async Task<MyEntity> GetMyEntityAsync(int id)
{
    // Define a unique key for this method and its parameters.
    var key = $"MyEntity:{id}";
    var cache = Connection.GetDatabase();

    // Try to get the entity from the cache.
    var json = await cache.StringGetAsync(key).ConfigureAwait(false);
    var value = string.IsNullOrWhiteSpace(json)
        ? default(MyEntity)
        : JsonConvert.DeserializeObject<MyEntity>(json);

    if (value == null) // Cache miss
    {
        // If there's a cache miss, get the entity from the original store and cache it.
        // Code has been omitted because it's data store dependent.
        value = ...;

        // Avoid caching a null value.
        if (value != null)
        {
            // Put the item in the cache with a custom expiration time that
            // depends on how critical it is to avoid stale data.
            await cache.StringSetAsync(key, JsonConvert.SerializeObject(value)).ConfigureAwait(false);
            await cache.KeyExpireAsync(key, TimeSpan.FromMinutes(DefaultExpirationTimeInMinutes)).ConfigureAwait(false);
        }
    }

    return value;
}

The examples use the Azure Redis Cache API to access the store and retrieve information from the cache. For more information, see Using Microsoft Azure Redis Cache and How to create a Web App with Redis Cache.

The UpdateEntityAsync method shown below demonstrates how to invalidate an object in the cache when the value is changed by the application. This is an example of a write-through approach. The code updates the original data store and then removes the cached item from the cache by calling the KeyDeleteAsync method, specifying the key.

The order of the steps in this sequence is important. Update the data store before removing the item from the cache. If the item is removed from the cache first, there is a short window in which a client can request the item before the data store has been changed. That request misses the cache, reads the old version from the data store, and adds it back to the cache, so the cache ends up containing stale data.

public async Task UpdateEntityAsync(MyEntity entity)
{
    // Update the object in the original data store first.
    await this.store.UpdateEntityAsync(entity).ConfigureAwait(false);

    // Then invalidate the current cache object.
    var cache = Connection.GetDatabase();
    var id = entity.Id;
    var key = $"MyEntity:{id}"; // Get the correct key for the cached object.
    await cache.KeyDeleteAsync(key).ConfigureAwait(false);
}
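
The effect of the step order can be illustrated with a small deterministic simulation. This is a sketch under simplifying assumptions: the names InvalidationOrderSketch and CachedValueAfterRacingRead are hypothetical, dictionaries stand in for the cache and the data store, and the concurrent reader is modeled as a single read placed between the two steps of the update.

```csharp
using System.Collections.Generic;

public static class InvalidationOrderSketch
{
    // Runs an update from "old" to "new" with one reader arriving between the
    // two update steps, then returns what a later read gets from the cache-aside path.
    public static string CachedValueAfterRacingRead(bool deleteCacheFirst)
    {
        var store = new Dictionary<int, string> { [1] = "old" };
        var cache = new Dictionary<int, string> { [1] = "old" };

        if (deleteCacheFirst)
        {
            cache.Remove(1);
            Read(cache, store, 1);  // reader in the gap: cache miss, re-caches "old"
            store[1] = "new";
        }
        else
        {
            store[1] = "new";
            Read(cache, store, 1);  // reader in the gap: still sees "old", but only briefly
            cache.Remove(1);
        }

        // A read after the update has completed.
        return Read(cache, store, 1);
    }

    // Cache-aside read: use the cache if possible, otherwise load from the store.
    private static string Read(Dictionary<int, string> cache, Dictionary<int, string> store, int id)
    {
        if (!cache.TryGetValue(id, out var value))
        {
            value = store[id];
            cache[id] = value;
        }
        return value;
    }
}
```

When the cache entry is deleted first, the later read still returns "old" because the racing reader put the outdated value back. When the store is updated first, the later read misses the cache and returns "new".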

Related guidance

The following information may be relevant when implementing this pattern:

  • Caching Guidance. Provides additional information on how you can cache data in a cloud solution, and the issues that you should consider when you implement a cache.

  • Data Consistency Primer. Cloud applications typically use data that's spread across data stores. Managing and maintaining data consistency in this environment is a critical aspect of the system, particularly the concurrency and availability issues that can arise. This primer describes issues about consistency across distributed data, and summarizes how an application can implement eventual consistency to maintain the availability of data.

