Michał Gruchała - Data sharding

Data Sharding Michał Gruchała [email protected] WebClusters 2011


Page 1: Michał Gruchała - Data sharding

8/6/2019 Michał Gruchała - Data sharding

http://slidepdf.com/reader/full/michal-gruchala-data-sharding 1/32

Data Sharding

Michał Gruchała 

[email protected] 2011


TODO

● Background
● Theory
● Practice
● Summary


Background

Microblogging site
● user messages (blog)
● cockpit/wall

Classic architecture
● database
● web server(s)
● loadbalancer(s)


Background

Web servers, load balancers
● one server
● ...
● 1000 servers
● not a problem

Database
● one database
● two databases (master -> slave)
● two databases (master <-> master)
● n databases (slave(s) <- master <-> master -> slave(s))

a lot of replication ;)


Background

Replication
● increases read performance (raid1)
● increases data safety (raid1)
● does not increase the system's capacity (GBs)


Background

Scalability

● stateless elements scale well
● stateful elements
  ○ quite easy to scale
    ■ if we want more reads (cache, replication)
  ○ hard to scale
    ■ if we want more writes
    ■ if we want more capacity


Background

Sharding ;)

[Figure: one database holding rows A B C D / E F G H / I J K L, then the same rows split into three separate parts: A B C D, E F G H, I J K L]


Theory


Theory

Scaling
● Scale Back
  ○ delete, archive unused data
● Scale Up (vertical)
  ○ more power, more disks
● Scale Out (horizontal)
  ○ add machines
    ■ functional partitioning
    ■ replication
    ■ sharding


Theory

Sharding
● split one big database into many smaller databases
  ○ spread rows
  ○ spread them across many servers
● shared-nothing partitioning
● not replication


Theory

Sharding key

● shard by a key
● all data with that key will be on the same shard
● e.g. shard by user - all information connected to a user is on one shard (user info, messages, friends list)

user 1 -> shard 1
user 2 -> shard 2
user 3 -> shard 1
user 4 -> shard 2

● choosing the right key is very important!
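
A minimal sketch of what "all data with that key on the same shard" means, assuming plain in-memory dicts stand in for the two shards. The user-to-shard assignment is copied from the list above; the names `shards`, `USER_SHARD`, `store` and `load` are hypothetical.

```python
# Hypothetical in-memory stand-ins for two database shards.
shards = {1: {}, 2: {}}

# Sharding key = user id; the assignment is copied from the slide.
USER_SHARD = {1: 1, 2: 2, 3: 1, 4: 2}


def store(user_id, kind, value):
    """Store any record of a user (info, message, friend) on that user's shard."""
    shard = shards[USER_SHARD[user_id]]
    shard.setdefault(user_id, {}).setdefault(kind, []).append(value)


def load(user_id, kind):
    """Read a user's records; only that user's shard is touched."""
    return shards[USER_SHARD[user_id]].get(user_id, {}).get(kind, [])


store(1, "messages", "hello")
store(1, "friends", 3)
store(2, "messages", "hi")
```

Everything about user 1 ends up on shard 1, so showing that user's page never needs a second shard.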


Theory

Sharding function

● maps keys to shards
● where to find the data
● where to store the data

shard number = sf(key)


Theory

Sharding function

● Dynamic
  ○ Mapping in a database table
● Fixed
  ○ Modulo
      shard number = id % shards_count
  ○ Hash + Modulo
      shard number = md5(email) % shards_count
  ○ Consistent hashing
      http://en.wikipedia.org/wiki/Consistent_hashing
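
The three fixed functions can be sketched as below. This is an illustration under assumptions, not the author's code: the function and class names, the shard names and the number of ring points per shard are all made up for the example.

```python
import bisect
import hashlib


def shard_by_modulo(numeric_id, shards_count):
    """Modulo: shard number = id % shards_count."""
    return numeric_id % shards_count


def shard_by_hash(email, shards_count):
    """Hash + modulo: shard number = md5(email) % shards_count."""
    digest = hashlib.md5(email.encode("utf-8")).hexdigest()
    return int(digest, 16) % shards_count


class ConsistentHashRing:
    """Minimal consistent-hash ring; points_per_shard is an assumed tuning knob."""

    def __init__(self, shard_names, points_per_shard=64):
        # Each shard is placed on the ring at several pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{name}#{i}"), name)
            for name in shard_names
            for i in range(points_per_shard)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def shard_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

With plain modulo, changing shards_count remaps most keys. With the ring, adding a shard only moves the keys that now fall on the new shard's points.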


Theory

Advantages

● Linear write/read performance scalability (raid0)
● Capacity increase (raid0)
● Smaller databases are easier to manage
  ○ alter
  ○ backup/restore
  ○ truncate ;)
● Smaller databases are faster
  ○ as they may fit into memory
● Cost effective
  ○ 80 cores, 20 HDs, 80 GB RAM vs
  ○ 10 x (8 cores, 2 HDs, 8 GB RAM)


Theory

Challenges

● Globally unique IDs
  ○ unique across all shards
    ■ auto_increment_increment, auto_increment_offset
    ■ global IDs table
  ○ not unique across shards
    ■ IDs in dbs - not unique
    ■ shard_number - unique
    ■ global unique ID = shard_number + db ID
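
Two of these options can be sketched in a few lines. The first part only imitates in Python what the MySQL settings auto_increment_offset and auto_increment_increment do to a sequence; the second packs shard_number and the local ID into one integer. The 48-bit width and all function names are assumptions made for the example.

```python
import itertools


def interleaved_ids(offset, increment):
    """Imitate auto_increment_offset / auto_increment_increment:
    yields offset, offset + increment, offset + 2 * increment, ..."""
    return itertools.count(start=offset, step=increment)


# Three shards share increment 3 and use offsets 1, 2, 3,
# so no ID is ever generated on two shards.
shard_sequences = [interleaved_ids(offset, 3) for offset in (1, 2, 3)]
first_ids = [[next(seq) for _ in range(3)] for seq in shard_sequences]

LOCAL_ID_BITS = 48  # assumed width reserved for the per-shard ID


def make_global_id(shard_number, local_id):
    """global unique ID = shard_number + db ID, packed into one integer."""
    return (shard_number << LOCAL_ID_BITS) | local_id


def split_global_id(global_id):
    """Recover (shard_number, local_id) from a packed global ID."""
    return global_id >> LOCAL_ID_BITS, global_id & ((1 << LOCAL_ID_BITS) - 1)
```

A side effect of the packed form is that the ID itself tells the application which shard holds the row.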


Challenges

Re-sharding

● consistent hashing
or
● more shards than machines/nodes (e.g. 100 shards on 10 machines)

9 shards on 3 machines: 1,4,7 | 2,5,8 | 3,6,9
9 shards on 5 machines: 1,6 | 2,7 | 3,8 | 4,9 | 5
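
The "more shards than machines" idea can be sketched as below. The round-robin placement rule is an assumption chosen because it reproduces the numbers on the slide; the function names are hypothetical.

```python
def assign_shards(shard_ids, machine_count):
    """Round-robin placement: shard s goes to machine (s - 1) % machine_count."""
    machines = [[] for _ in range(machine_count)]
    for shard_id in shard_ids:
        machines[(shard_id - 1) % machine_count].append(shard_id)
    return machines


def shard_for_key(key, shard_count):
    """The key -> shard mapping depends only on the fixed shard count,
    never on the number of machines."""
    return key % shard_count + 1


shard_ids = range(1, 10)             # nine logical shards
before = assign_shards(shard_ids, 3)  # three machines
after = assign_shards(shard_ids, 5)   # grown to five machines
```

Because the number of logical shards stays at nine, growing the cluster means moving whole shards between machines; no row has to be assigned to a different shard.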


Challenges

Cross-shard

● queries
  ○ sent to many shards
  ○ collect result from one
  ○ avoidable (better sharding key, more sharding keys)
● joins
  ○ send query to many shards
  ○ join results in an application
  ○ sometimes unavoidable


Challenges

Network

● more machines, more smaller streams
● full-mesh between webservers and shards
● pconnect vs. connect

Complexity

● usually sharding is done in application logic


Practice

Microblogging site
● see a user's messages
● see stream/wall

Classic architecture
● database
● web server(s)
● loadbalancer(s)


Practice

Data

User table
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan

Messages table
id  owner  message
1   2      M1
2   1      M2
3   2      M3
4   3      M4
5   2      M5

Follow table
who  whose
1    2
3    4
3    2
1    3
5    2
2    1
1    5
4    3
4    1

John's messages? John's follows?


Practice

User table
id  login
1   John
2   Bob
3   Andy
4   Claire
5   Megan

shard0
  Messages table
  id  owner  message
  1   2      M1
  3   2      M3
  5   2      M5

  Follow table
  who  whose
  2    1
  4    3
  4    1

shard1
  Messages table
  id  owner  message
  2   1      M2
  4   3      M4

  Follow table
  who  whose
  1    2
  3    4
  3    2
  1    3
  5    2
  1    5

mapping?


Practice

Bob's blog

● Bob's messages
  ○ find Bob's id in User table (id = 2)
  ○ find Bob's shard (2%2 = 0, shard0)
  ○ fetch Messages (shard0) where owner = 2
● People Bob follows
  ○ find Bob's id in User table (id = 2)
  ○ find Bob's shard (2%2 = 0, shard0)
  ○ fetch whose id from Follow table (shard0)
  ○ fetch people info from User table
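
The two lookups above can be sketched with the slide's data held in plain Python structures. This assumes, as the steps suggest, that the User table is a single unsharded table and that the shard is picked with id % 2; the names `USERS`, `SHARDS`, `messages_of` and `followed_by` are hypothetical.

```python
# User table: looked up before any shard is chosen.
USERS = {1: "John", 2: "Bob", 3: "Andy", 4: "Claire", 5: "Megan"}

SHARDS = [
    {  # shard0: rows keyed by users with an even id
        "messages": [(1, 2, "M1"), (3, 2, "M3"), (5, 2, "M5")],
        "follow": [(2, 1), (4, 3), (4, 1)],
    },
    {  # shard1: rows keyed by users with an odd id
        "messages": [(2, 1, "M2"), (4, 3, "M4")],
        "follow": [(1, 2), (3, 4), (3, 2), (1, 3), (5, 2), (1, 5)],
    },
]


def user_id(login):
    """Step 1: find the user's id in the User table."""
    return next(uid for uid, name in USERS.items() if name == login)


def shard_of(uid):
    """Step 2: find the user's shard with id % shards_count."""
    return SHARDS[uid % len(SHARDS)]


def messages_of(login):
    """Step 3a: fetch messages from that one shard where owner = id."""
    uid = user_id(login)
    return [text for _, owner, text in shard_of(uid)["messages"] if owner == uid]


def followed_by(login):
    """Step 3b: fetch 'whose' ids from that shard, then names from the User table."""
    uid = user_id(login)
    return [USERS[whose] for who, whose in shard_of(uid)["follow"] if who == uid]
```

Both questions are answered from a single shard because the rows are sharded by the id of the user who wrote the message or does the following.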




Practice

Who follows Andy?

● find Andy's id in User table (id = 3)
● find Andy's shard (3%2 = 1, shard1)
● hmmm


Practice

[Same User table and shard0/shard1 layout as on the earlier Practice slide]

Follow rows with whose = 3 (Andy) sit on both shards: 4 -> 3 on shard0 and 1 -> 3 on shard1.

Cross-shard query!
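
A minimal sketch of that cross-shard query, again with the slide's Follow rows held in Python lists. The scatter-gather loop and the names `FOLLOW_SHARDS` and `followers_of` are assumptions for illustration; a real application would send one SQL query per shard.

```python
USERS = {1: "John", 2: "Bob", 3: "Andy", 4: "Claire", 5: "Megan"}

# Follow rows (who, whose), sharded by "who" with id % 2.
FOLLOW_SHARDS = [
    [(2, 1), (4, 3), (4, 1)],                          # shard0
    [(1, 2), (3, 4), (3, 2), (1, 3), (5, 2), (1, 5)],  # shard1
]


def followers_of(login):
    """Return (sorted follower names, number of shards queried)."""
    uid = next(u for u, name in USERS.items() if name == login)
    found = []
    shards_queried = 0
    # Scatter: the sharding key is "who", so rows with a given "whose"
    # can be on any shard and every shard has to be asked.
    for rows in FOLLOW_SHARDS:
        shards_queried += 1
        found.extend(who for who, whose in rows if whose == uid)
    # Gather: merge the partial results and join with the User table
    # in the application.
    return sorted(USERS[who] for who in found), shards_queried
```

followers_of("Andy") has to visit both shards to find Claire and John, while the "people Bob follows" lookup needed only one.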


Practice

[Same User table and shard0/shard1 layout as on the earlier Practice slide]

Ideas?


Summary

 


Summary

To shard or not to shard

● many reads, few writes? - don't
● many writes and no capacity problems? - don't (use SSD)
● capacity problems? - yes
● many writes and capacity problems? - yes
● scale-up is affordable? - don't shard

As you see... it depends!


Summary

If you have to shard

● always use sharding + replication = raid10
  ○ sharding alone reduces availability (like raid0)
● use more shards than you need
  ○ e.g. 4 machines, 100 shards
  ○ or dynamic allocation
● think of network capacity (full-mesh)
  ○ load sharding (google it ;))
● sharding key - important!
  ○ cross-shard queries


Wake Up!