An Introduction to Google Cloud Spanner Part 1
Google Cloud Spanner is a globally distributed relational database service for massively scalable applications. It uses automatic sharding (Spanner calls its shards splits) and replication, is designed to scale up to millions of machines and trillions of database rows, and promises up to 99.999% availability.
Cloud Spanner combines the scalability of a NoSQL database with the qualities of a relational database, including schemas, SQL queries, and ACID transactions.
Main building blocks of Spanner
1. Google’s network infrastructure – Provides global connectivity
2. TrueTime – Provides a highly accurate, highly available global clock
3. Optimized Software Stack – Automatic resharding and Paxos implementation
Features:
High availability
Cloud Spanner offers multi-regional deployments with a 99.999% availability SLA and single-region deployments with a 99.99% availability SLA. Administrative operations such as replication and sharding are handled without database downtime.
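To make the two SLA figures concrete, here is a small arithmetic sketch that converts an availability percentage into a downtime budget. It assumes the percentage is applied over a full 365-day year purely for illustration; the actual SLA terms define their own measurement period. The function name is made up for this example.

```python
# Convert an availability percentage into the maximum downtime it allows.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def max_downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for label, sla in [("multi-regional", 99.999), ("regional", 99.99)]:
    print(f"{label}: {sla}% -> {max_downtime_minutes_per_year(sla):.2f} min/year")
# multi-regional: 99.999% -> 5.26 min/year
# regional: 99.99% -> 52.56 min/year
```

In other words, the extra "nine" shrinks the yearly downtime budget from under an hour to roughly five minutes.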
Instances
Instances can be either regional or multi-regional. Multi-regional instances rely on TrueTime (a globally distributed clock with high accuracy and availability) to provide global consistency and higher availability.
Security
Cloud Spanner provides enterprise-grade security through IAM integration: permissions and access can be configured for groups and users at the instance and database level.
Replication
Replication provides geographic locality and global availability. Transactions are replicated using the Paxos distributed consensus protocol, which ensures that a write has been accepted by a majority of replicas before it is committed.
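The "replicate before commit" idea can be sketched in a few lines. This is not a Paxos implementation and not Spanner's code; it only illustrates the majority-acknowledgement rule, and the Replica class and replicate_and_commit function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one replica of the data."""
    name: str
    healthy: bool = True
    log: list = field(default_factory=list)


def replicate_and_commit(replicas: list, write: str) -> bool:
    """Apply the write only if a strict majority of replicas can acknowledge it."""
    acked = [r for r in replicas if r.healthy]  # unhealthy replicas cannot acknowledge
    quorum = len(replicas) // 2 + 1             # 2 of 3, 3 of 5, and so on
    if len(acked) < quorum:
        return False                            # no majority: nothing is committed
    for r in acked:
        r.log.append(write)
    return True


replicas = [Replica("zone-a"), Replica("zone-b"), Replica("zone-c", healthy=False)]
print(replicate_and_commit(replicas, "UPDATE 1"))  # True: 2 of 3 acknowledged
replicas[1].healthy = False
print(replicate_and_commit(replicas, "UPDATE 2"))  # False: only 1 of 3 acknowledged
```

With three replicas, the write survives the loss of any single replica, which is why a majority rather than a single copy is required.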
ACID Transaction and SQL Query Support
Cloud Spanner supports ACID transactions and SQL queries based on the ANSI SQL standard.
Architecture of Spanner
In a regional configuration, Spanner keeps at least three replicas of each shard, one in each of three zones. A shard is known as a split. When we create an instance with 1 node, 2 more nodes are created in other zones. At any given time, Paxos designates one replica as the leader, and the others act as followers. Paxos also helps maintain the minimum number of nodes and elects a new leader if the current one fails.
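The leader and failover behaviour described above can be sketched as follows. Real Paxos leader election involves ballots and leases; this toy version only shows that a leader can be chosen while a majority of zones is healthy. The zone names and the "first healthy zone alphabetically" rule are placeholders assumed for this example.

```python
from typing import Optional


def elect_leader(replica_health: dict) -> Optional[str]:
    """Pick a leader among healthy replicas, but only if a majority is alive.

    replica_health maps a hypothetical zone name to True (healthy) or False.
    """
    healthy = sorted(zone for zone, ok in replica_health.items() if ok)
    quorum = len(replica_health) // 2 + 1
    if len(healthy) < quorum:
        return None  # no majority, so no leader can be chosen
    # Placeholder rule: take the first healthy zone alphabetically.
    return healthy[0]


zones = {"zone-a": True, "zone-b": True, "zone-c": True}
print(elect_leader(zones))  # zone-a
zones["zone-a"] = False     # the current leader's zone fails
print(elect_leader(zones))  # zone-b
zones["zone-b"] = False     # a second failure removes the majority
print(elect_leader(zones))  # None
```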
High accuracy and availability with TrueTime
How does Spanner provide strong consistency across all nodes? The answer is TrueTime!
It's not a good idea to explain Spanner without diving into TrueTime.
TrueTime is a highly available, distributed clock that is provided to applications on all Google servers. It enables applications to generate monotonically increasing timestamps. The underlying hardware uses GPS receivers and atomic clocks to keep time.
During each write operation, Spanner takes a TrueTime value and provides it to the leader node. The data is then replicated to the other nodes together with that timestamp, so the timestamp is the same on all nodes.
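The write path just described can be sketched with a simulated clock. This is a toy model rather than the real TrueTime API: SimulatedTrueTime, Leader, the replica names, and the 4 ms uncertainty bound are all assumptions made for illustration. The sketch shows two things: the leader never hands out a timestamp lower than a previous one, and every replica stores the same timestamp for a given write.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """Bounds within which the true current time is assumed to lie."""
    earliest: float
    latest: float


class SimulatedTrueTime:
    """Toy clock: wraps a local clock with a fixed uncertainty (assumed value)."""

    def __init__(self, clock=time.time, epsilon=0.004):
        self.clock = clock
        self.epsilon = epsilon  # assumed 4 ms bound, chosen for illustration only

    def now(self) -> TTInterval:
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)


class Leader:
    """Assigns one timestamp per write and hands the same value to every replica."""

    def __init__(self, truetime, replica_names):
        self.truetime = truetime
        self.last_timestamp = float("-inf")
        self.replicas = {name: [] for name in replica_names}

    def write(self, data):
        # Take the upper bound of the interval, and never go backwards.
        candidate = self.truetime.now().latest
        timestamp = max(candidate, self.last_timestamp + 1e-6)
        self.last_timestamp = timestamp
        for log in self.replicas.values():
            log.append((timestamp, data))  # identical timestamp on every replica
        return timestamp


leader = Leader(SimulatedTrueTime(), ["replica-1", "replica-2", "replica-3"])
t1 = leader.write("row A")
t2 = leader.write("row B")
print(t2 > t1)  # True
```

Because the timestamp is chosen once by the leader and shipped along with the data, replicas never need to consult their own clocks to agree on the order of writes.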
Database replication
One of the major issues Spanner has to address is database replication on a global scale, i.e., data must stay consistent when multiple users perform transactions across the globe. The Spanner team at Google addresses this with the Paxos protocol, a distributed consensus algorithm (an algorithm that ensures that a change made to one copy of the data is propagated to all of its replicas).
When to choose Spanner?
Let’s take a close look at the flowchart below to understand when Cloud Spanner is the right choice.
Cloud Spanner in a Nutshell
Google Cloud Spanner is a cloud-native, enterprise-grade, fully managed relational database that offers high availability and horizontal scalability with globally consistent ACID transactions. It delivers high performance and strong consistency across rows, regions, and continents, with an industry-leading 99.999% availability SLA.