LCache DrupalCon Dublin 2016
Uploaded by david-timothy-strauss. Category: Engineering.
Transcript of LCache DrupalCon Dublin 2016
Intelligent, Tiered, Scalable Caching with LCache
Existing Caching Challenges
Pantheon.io
Traditional Web Caching
[Diagram: four web servers send cache traffic to a single Redis or Memcache instance, which is labeled as the bottleneck.]
The Anatomy of a Bottleneck
Scaling Traditional Web Caching
[Diagram: four web servers send cache traffic to two Redis or Memcache instances.]
● Use replication?
○ Failover issues
○ Replication lag or slow writes
● Use sharding?
○ Consistency issues
● Still network-bound
Proudly Designed Elsewhere: Employing Known Solutions
Existing Solutions: Multi-Core Processors
Existing Solutions: Pantheon’s Valhalla
[Diagram: three application containers, each with a file mount and a cache, connected to three file servers, with arrows labeled "Writes" and "Events".]
Existing Solutions: MySQL Row Replication
[Diagram: the application sends SQL to the MySQL primary; the primary sends row changes (no SQL) to the MySQL replica. Callout: "Keeps Replication Simple!"]
shell> mysqlbinlog -vv log_file
...
# at 302
#080828 15:03:08 server id 1 end_log_pos 356 Update_rows: table id 17 flags: STMT_END_F
BINLOG '
fAS3SBMBAAAALAAAAC4BAAAAABEAAAAAAAAABHRlc3QAAXQAAwMPCgIUAAQ=
fAS3SBgBAAAANgAAAGQBAAAQABEAAAAAAAEAA////AEAAAAFYXBwbGX4AQAAAARwZWFyIbIP
'/*!*/;
### UPDATE test.t
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
“Because it’s faster, of course.”
● Inspired by multicore processors
⌾ Get the working set close to the work
⌾ Trade some write performance and scale for massive read gains
⌾ Hide the coherency management
● Inspired by Pantheon’s Valhalla file system
⌾ Write-through: clients can leave at any point
⌾ Incremental changes freshen the local cache
⌾ Only as read-after-write consistent as it needs to be
● Inspired by MySQL row-based replication
⌾ Materialize complex tag deletion on the primary instance and only replicate the key-based changes
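The three inspirations above can be sketched together as a toy two-tier cache. This is a minimal illustration, not LCache's code: the class names (ConsistentCache, LocalCache), the in-memory event log, and the use of None as a deletion marker are all assumptions made for the example. It shows reads served from a local tier, write-through to a shared tier, incremental events freshening the local tier, and a tag deletion resolved into per-key deletions before anything is replicated.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

# (event_id, key, value); a value of None means "this key was deleted".
Event = Tuple[int, str, Optional[Any]]


class ConsistentCache:
    """Hypothetical stand-in for the shared, consistent tier: an ordered event log."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self.tags: Dict[str, Set[str]] = {}  # tag -> keys carrying that tag

    def write(self, key: str, value: Optional[Any], tags: Tuple[str, ...] = ()) -> int:
        event_id = len(self.events) + 1  # ids are 1, 2, 3, ... in write order
        self.events.append((event_id, key, value))
        for tag in tags:
            self.tags.setdefault(tag, set()).add(key)
        return event_id

    def delete_tag(self, tag: str) -> None:
        # Resolve the tag into plain per-key deletions here, so that
        # local caches only ever replay key-based changes.
        for key in sorted(self.tags.pop(tag, set())):
            self.write(key, None)

    def latest(self, key: str) -> Optional[Any]:
        for _, event_key, value in reversed(self.events):
            if event_key == key:
                return value
        return None

    def events_after(self, event_id: int) -> List[Event]:
        # Because ids are sequential from 1, slicing skips the applied prefix.
        return self.events[event_id:]


class LocalCache:
    """Hypothetical per-web-server tier that sits close to the work."""

    def __init__(self, shared: ConsistentCache) -> None:
        self.shared = shared
        self.data: Dict[str, Any] = {}
        self.last_applied = 0

    def synchronize(self) -> None:
        # Apply only the changes made since the last synchronization.
        for event_id, key, value in self.shared.events_after(self.last_applied):
            if value is None:
                self.data.pop(key, None)
            else:
                self.data[key] = value
            self.last_applied = event_id

    def get(self, key: str) -> Optional[Any]:
        if key in self.data:
            return self.data[key]  # local hit: no trip to the shared tier
        value = self.shared.latest(key)
        if value is not None:
            self.data[key] = value
        return value

    def set(self, key: str, value: Any, tags: Tuple[str, ...] = ()) -> None:
        # Write-through: the shared tier has the value before we return.
        self.shared.write(key, value, tags)
        self.data[key] = value
```

In this sketch a second LocalCache keeps serving its old value until it calls synchronize(), which is one way to read "only as read-after-write consistent as it needs to be".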
Contrast: ChainedFastBackend
Beginning of Request
LCache: Synchronizes cache writes and bin/key invalidations. One SELECT query.
ChainedFastBackend: Updates bin invalidation data. One SELECT query.
Read Key
LCache: Reads local cache. If the key does not exist in the local cache, reads consistent cache. No query if hitting local cache.
ChainedFastBackend: Reads from local cache. No query if hitting local cache.
Write or Invalidate Key
LCache: Writes to local and consistent caches. One INSERT query.
ChainedFastBackend: Writes to local and consistent caches. Invalidates entire bin in all local caches. Up to two queries per write.
Invalidate Tag
LCache: Writes to consistent cache and generates key invalidations. Multiple queries.
ChainedFastBackend: Writes to consistent cache. Invalidates entire bin in local caches.
End of Request
LCache: Garbage-collects deletions. Executes one batched DELETE query (if cache writes have occurred) after request closes.
ChainedFastBackend: No activity.
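The practical difference between invalidating one key and invalidating a whole bin can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function name count_local_hits, the operation names, and the strategy labels "per_key" and "whole_bin" are invented for the example, and a single local cache holding a single bin is modeled.

```python
from typing import Iterable, Set, Tuple


def count_local_hits(operations: Iterable[Tuple[str, str]], strategy: str) -> int:
    """Count reads served from the local cache for a sequence of operations.

    Each operation is ("read", key) or ("remote_write", key). A remote write
    is a write made by another web server to a key in the same bin.
    """
    if strategy not in ("per_key", "whole_bin"):
        raise ValueError(f"unknown strategy: {strategy}")
    local: Set[str] = set()
    hits = 0
    for operation, key in operations:
        if operation == "read":
            if key in local:
                hits += 1
            else:
                local.add(key)  # miss: fetch from the consistent cache, keep a copy
        elif operation == "remote_write":
            if strategy == "per_key":
                local.discard(key)  # only the written key is dropped
            else:
                local.clear()  # the whole bin is dropped
        else:
            raise ValueError(f"unknown operation: {operation}")
    return hits


workload = [
    ("read", "a"), ("read", "b"), ("read", "c"),  # warm the local cache
    ("remote_write", "a"),                        # another server writes "a"
    ("read", "a"), ("read", "b"), ("read", "c"),
]
per_key_hits = count_local_hits(workload, "per_key")
whole_bin_hits = count_local_hits(workload, "whole_bin")
```

With this workload the per-key strategy still serves "b" and "c" locally after the remote write, while the whole-bin strategy has to refetch all three keys.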
Challenges and Solutions
Unexpected Issues
● Sites write to caches very often
⌾ Seeing 10-40 cache “sets” per page
⌾ LCache’s “sets” are expensive (transactional database plus replication to clients)
⌾ Most modules assume a miss is a good reason to “set.”
⌾ Some cache items are “set” more than “get.”
● Using tags for bins was not fast enough
⌾ Relational model created too much overhead
⌾ Materializing the clearing of a whole bin wasn’t efficient (replicated many, many changes)
⌾ Moved to native bin support
Write Models to Optimize the “Set” Path
Low Splay (each write to random choice of 64 keys)
High Splay (each write to random choice of 4096 keys)
10 Processes ✕ 40 Writes Each
Winner here!
And not worse here!
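The shape of the benchmark workload above can be approximated with a short generator. This sketch only reproduces the write pattern (10 processes, 40 writes each, keys drawn at random from 64 or 4096 candidates); it runs the "processes" one after another in a single Python process and measures nothing about LCache itself. The function name simulate_writes and its parameters are assumptions made for the example.

```python
import random
from collections import Counter
from typing import Dict


def simulate_writes(key_space: int, processes: int = 10,
                    writes_each: int = 40, seed: int = 0) -> Dict[str, int]:
    """Draw a random key for every write and summarize how often keys repeat."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    writes_per_key: Counter = Counter()
    for _ in range(processes):
        for _ in range(writes_each):
            writes_per_key[rng.randrange(key_space)] += 1
    total = sum(writes_per_key.values())
    distinct = len(writes_per_key)
    return {
        "total_writes": total,
        "distinct_keys": distinct,
        # Writes that landed on a key that had already been written.
        "overwrites": total - distinct,
    }


low_splay = simulate_writes(key_space=64)
high_splay = simulate_writes(key_space=4096)
```

Low splay concentrates the 400 writes on at most 64 keys, so most writes overwrite an existing key; high splay spreads them out, so most writes create a new key. A write model has to handle both patterns well.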
Machine Learning: Avoiding Useless “Sets”
Loading iterator...
Iterating...
Array
(
    [lcache:10.223.176.176:18341:5:cache:environment_indicator] => 5634
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:taxonomy_term] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_8] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_1] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_2] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_3] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:calendar] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_5] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:redirects] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:backlinks] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_7] => 3037
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_6] => 3036
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:frontpage] => 3036
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:nodequeue_4] => 3036
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:agency_search] => 3036
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:glossary] => 3036
    [lcache:10.223.176.176:18341:11:cache_views:ctools_export:views_view:campaign] => 3036
LCache starts ignoring at 100 now
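One plausible reading of the counters above is a per-key tally of writes that were never paid back by reads, with writes ignored once the tally reaches a threshold. The sketch below illustrates that idea only. The class name WriteAvoidingCache, the rule "add one per set attempt, subtract one per get", and dropping the entry when a set is ignored are all assumptions for the example, not a description of LCache's internals.

```python
from typing import Any, Dict, Optional


class WriteAvoidingCache:
    """Toy cache that stops accepting sets for keys that are written but not read."""

    def __init__(self, threshold: int = 100) -> None:
        self.threshold = threshold
        self.data: Dict[str, Any] = {}
        self.overhead: Dict[str, int] = {}  # key -> set attempts minus gets
        self.skipped = 0

    def get(self, key: str) -> Optional[Any]:
        # Every read pays back one earlier write.
        self.overhead[key] = self.overhead.get(key, 0) - 1
        return self.data.get(key)

    def set(self, key: str, value: Any) -> bool:
        """Store the value unless the key has proven too write-heavy.

        Returns True if the value was stored and False if the set was ignored.
        """
        seen = self.overhead.get(key, 0)
        self.overhead[key] = seen + 1
        if seen >= self.threshold:
            # Drop any stored value so that readers miss instead of
            # receiving data that is older than the ignored write.
            self.data.pop(key, None)
            self.skipped += 1
            return False
        self.data[key] = value
        return True
```

With a rule like this, a key that is set on every page load but rarely read stops generating expensive writes, and it becomes cacheable again once reads catch up.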
Configuration: Assigning Bins and Keys
● Better with LCache
⌾ Frequently read
⌾ Rarely written
⌾ Large
● Worse (or not ideal) with LCache
⌾ Read once or not at all (e.g. form cache should use normal database cache backend)
⌾ Things handleable earlier in the stack (e.g. Varnish instead of Drupal’s page cache)
⌾ Keys updated often (partly mitigated with machine learning)
⌾ Clearing 100+ keys with a tag (because of replication)
Built for Reliability
Test-Driven Development
Composer-Based Library
Lightweight Adapters for Frameworks
● Stateless
● Composer inclusion of the LCache library
● Modules and extensions
⌾ Drupal 7 module
⌾ Drupal 8 module
⌾ WordPress drop-in
● Drupal 8.3+ core?
Performance and Scalability
Comparing Against Redis: Performance
Comparing Against Redis: Concurrency
Going Live: Performance
Going Live: Impact on Databases
Next Steps
Further Performance Improvements
● Try mysqli with asynchronous queries for the initial synchronization.
⌾ Upside: No synchronous wait on obtaining events.
⌾ Downside: Yet another database connection.
● Synchronize (again) in the destructor after the request closes.
⌾ Upside: Potentially handles some events without users waiting.
⌾ Downside: Additional database queries.
● SQLite L1 cache
⌾ Upside: Persists across PHP-FPM restarts. Useful with CLI. Cache can be larger than memory.
⌾ Downside: Slower writes. Possible lock contention.
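The SQLite idea in the last bullet can be sketched as a small file-backed key-value store. This is an illustration only: the class name SqliteL1, the table layout, and JSON serialization are assumptions for the example, and the sketch is written in Python rather than as part of the PHP library.

```python
import json
import sqlite3
from typing import Any, Optional


class SqliteL1:
    """Hypothetical local cache tier stored in a SQLite file."""

    def __init__(self, path: str) -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS l1 ("
            "key TEXT PRIMARY KEY, "
            "value TEXT NOT NULL)"
        )
        self.db.commit()

    def set(self, key: str, value: Any) -> None:
        # Every write is a disk transaction, hence the slower writes
        # and the possible lock contention between processes.
        self.db.execute(
            "INSERT OR REPLACE INTO l1 (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.db.commit()

    def get(self, key: str) -> Optional[Any]:
        row = self.db.execute(
            "SELECT value FROM l1 WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else json.loads(row[0])

    def delete(self, key: str) -> None:
        self.db.execute("DELETE FROM l1 WHERE key = ?", (key,))
        self.db.commit()

    def close(self) -> None:
        self.db.close()
```

Because the data lives in a file, a new process that opens the same path sees the entries written by an earlier one, which is what makes this kind of tier survive restarts and serve command-line runs.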
Ambitions for Core
● ChainedFastBackend isn’t going to cut it.
⌾ Not usable for most cache bins.
⌾ Administrators need to carefully choose when to introduce it.
⌾ Degrades rapidly on cache writes.
● Even just the LCache L2 component is faster than Drupal’s built-in caches.
⌾ INSERT-only model is a big win.
⌾ LCache can use a Null L1 seamlessly.
● Relying on Composer-based libraries is widespread in Drupal 8.
● A default cache for most bins
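The INSERT-only model mentioned above can be sketched with a small event table: every write appends a row, reads take the newest row for a key, and superseded rows are removed later in one batched DELETE. The schema, the class name InsertOnlyL2, and the garbage-collection rule are assumptions for illustration, and SQLite stands in for the real database. The DELETE uses a subquery on the same table, which SQLite accepts but which may need rewriting for other databases.

```python
import sqlite3
from typing import Optional


class InsertOnlyL2:
    """Hypothetical consistent tier in which writes never update rows in place."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "event_id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "key TEXT NOT NULL, "
            "value TEXT)"
        )
        self.db.commit()

    def set(self, key: str, value: Optional[str]) -> None:
        # One INSERT per write; existing rows are left untouched.
        self.db.execute(
            "INSERT INTO events (key, value) VALUES (?, ?)", (key, value)
        )
        self.db.commit()

    def get(self, key: str) -> Optional[str]:
        row = self.db.execute(
            "SELECT value FROM events WHERE key = ? "
            "ORDER BY event_id DESC LIMIT 1",
            (key,),
        ).fetchone()
        return None if row is None else row[0]

    def collect_garbage(self) -> int:
        """Remove superseded rows with one DELETE and return how many went away."""
        cursor = self.db.execute(
            "DELETE FROM events WHERE event_id NOT IN "
            "(SELECT MAX(event_id) FROM events GROUP BY key)"
        )
        self.db.commit()
        return cursor.rowcount

    def row_count(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Deferring cleanup to a single DELETE keeps the write path to one cheap statement, which fits the earlier description of garbage collection running after the request closes.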
PSR-6 and PSR-16
● PSR-6
⌾ No concept of cache tags, an essential part of Drupal 8 caching.
⌾ No concept of retrieving invalidated items. (Not supported in LCache yet, but supported by Drupal 8.)
⌾ Interesting concept of deferred persistence.
● PSR-16
⌾ Counter interface wouldn’t be consumed by Drupal 8 (but would be by WordPress).
⌾ Mostly built on PSR-6.
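Deferred persistence in PSR-6 refers to the pool's saveDeferred() and commit() methods: items can be queued and then persisted together. The sketch below shows the concept in Python with an in-memory dictionary as the backend. The class DeferredPool and its backend_writes counter are invented for the example, and returning queued items from get() is a choice made here rather than a requirement taken from the specification.

```python
from typing import Any, Dict, Optional


class DeferredPool:
    """Toy cache pool with immediate and deferred saves."""

    def __init__(self) -> None:
        self.store: Dict[str, Any] = {}     # stands in for the real backend
        self.deferred: Dict[str, Any] = {}  # queued, not yet persisted
        self.backend_writes = 0             # round trips to the backend

    def get(self, key: str) -> Optional[Any]:
        if key in self.deferred:
            return self.deferred[key]
        return self.store.get(key)

    def save(self, key: str, value: Any) -> None:
        # Immediate persistence: one backend round trip per item.
        self.store[key] = value
        self.backend_writes += 1

    def save_deferred(self, key: str, value: Any) -> None:
        # Queue the item; nothing reaches the backend yet.
        self.deferred[key] = value

    def commit(self) -> bool:
        # Persist everything that was queued in one batch.
        if self.deferred:
            self.store.update(self.deferred)
            self.backend_writes += 1
            self.deferred.clear()
        return True
```

Batching of this kind is attractive for a cache whose sets are expensive, because many sets in one request can be collapsed into a single write.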