Amazon Web Services has re-architected the widely used open source MySQL database into a system optimized for delivery as a cloud service. Its Aurora database service, first announced in mid-November, is now generally available.
MySQL, frequently used behind websites and mobile applications, sits at the heart of Aurora. But Amazon has broken it down from a single system into its core parts and staged those parts as semi-autonomous services that can expand or contract as needed on its cloud infrastructure.
Aurora, for example, relies on Amazon's S3 storage to capture data at scale, preserve it in the face of potential system failure, and keep it constantly available to a running production system. To do so, it spreads two copies of each customer's data set across each of three availability zones, each zone being, in effect, a data center unit with its own power and communications.
One availability zone may fail, but it's unlikely that two or more will fail at the same time. The arrangement means a total of six copies of each customer's data are available, giving "eleven nines," or 99.999999999%, of data durability, said Matt Wood, general manager of AWS product strategy.
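As a back-of-the-envelope sketch, and not Amazon's actual failure model, suppose each of the six copies were lost independently within a year, with an assumed 1.3% per-copy loss probability; the chance of losing all six at once then works out to roughly eleven leading nines of durability:

```python
import math

# Illustrative assumptions only, not Amazon's published failure model:
# each of the six copies is lost independently within a year with probability p.
p_copy_loss = 0.013   # assumed 1.3% annual loss probability per copy
copies = 6

p_all_lost = p_copy_loss ** copies           # probability all six copies are lost
durability = 1 - p_all_lost                  # probability the data survives the year
nines = math.floor(-math.log10(p_all_lost))  # leading nines in the durability figure

print(f"{durability:.13f}")  # 0.9999999999952
print(nines)                 # 11
```

The point of the sketch is only that replicating data six ways across independent failure domains multiplies small loss probabilities together, which is how very high durability figures become plausible.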
Such a level of durability would be much harder to achieve on a large server-based commercial database system. Amazon also offers it at what Wood described as "one-tenth the cost" of most enterprise database systems.
[Want to learn about Aurora's beginnings? See Amazon Focuses On New Services, Not Price.]
"How database systems have been designed and built has remained the same for 40 years," Wood said. Although Amazon has previously offered MySQL as a software option among its relational database services, Aurora is MySQL as an on-demand service that can scale to almost any customer need, he said. "Anything you can do with MySQL 5.6 you can do with Aurora."
For many MySQL users, there will be things they can do with their MySQL apps in the cloud that they couldn't previously do on their own systems.
For example, Aurora can scale to capture a large influx of data or to handle a large number of queries. Aurora has a read-replication capability that lets it spin up a replica to serve a particular batch of queries. The replica remains synchronized with changes in the core database, but it fields the traffic related to those queries, avoiding any disruption of the main system's operation.
Such a process would be useful for a database system that faces heavy reporting demands at certain times of the day or month. It would allow those demands to be met while the main system continues its core transaction processing and updates.
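How an application might exploit that split can be pictured with a minimal read/write router. The endpoint names below are hypothetical, invented for illustration; the only idea taken from the article is that replicas absorb read traffic while the primary keeps taking writes:

```python
# Minimal sketch of application-side read/write splitting across
# hypothetical Aurora endpoints (the hostnames here are made up):
# read-only statements go to a replica, everything else to the primary.
PRIMARY = "mydb.cluster-example.us-east-1.rds.amazonaws.com"
REPLICA = "mydb.replica-example.us-east-1.rds.amazonaws.com"

READ_VERBS = ("select", "show", "describe", "explain")

def route(sql: str) -> str:
    """Return the endpoint a SQL statement should be sent to."""
    verb = sql.lstrip().split(None, 1)[0].lower()
    return REPLICA if verb in READ_VERBS else PRIMARY

print(route("SELECT * FROM readings WHERE day = CURDATE()"))  # replica endpoint
print(route("INSERT INTO readings VALUES (1, 42.0)"))         # primary endpoint
```

A real deployment would likely let a driver or proxy do this routing, but the sketch shows why reporting queries can surge without touching the transactional primary.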
Another capability is to scale the front-end data caching separately from the rest of the database system. The cache has been given its own semi-autonomous status, which allows it to expand or contract, and to continue running even if the core logic engine goes down. As a new copy of the engine is fired up, the cache doesn't need to be rebuilt; it's already running and waiting for the connection, which speeds time to recovery, Wood said.
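The idea can be pictured with a toy model (purely illustrative, not Aurora's internals): the cache is a separate object that outlives the engine, so a restarted engine reattaches to a still-warm cache instead of starting cold.

```python
# Toy illustration of a cache decoupled from the database engine.
# This is not Aurora's implementation, just the architectural idea
# that a separately running cache survives an engine crash and restart.
class Cache:
    def __init__(self):
        self.store = {}

class Engine:
    def __init__(self, cache: Cache):
        self.cache = cache  # attach to an externally managed cache

    def read(self, key, loader):
        if key not in self.cache.store:           # cold path: go to storage
            self.cache.store[key] = loader(key)
        return self.cache.store[key]

cache = Cache()                         # lives independently of any engine
engine = Engine(cache)
engine.read("user:1", lambda k: "Ada")  # first read warms the cache

engine = Engine(cache)                  # "crash" and restart the engine
# The new engine finds the cache already warm, so no reload is needed.
print("user:1" in cache.store)          # True
```

Because the cache object was never torn down, the restarted engine serves the key without calling the loader again, which is the recovery-time win Wood describes.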
Likewise, the storage engine is separate and relies on S3. It's designed for high performance when fielding large numbers of queries.
Asked how many members of the Fortune 500 have tested Aurora as a candidate for their transaction systems, Wood said, "I don't think we have those numbers available."
But Amazon cited Earth Networks, a weather observation and reporting service, as a user that migrated its Microsoft SQL Server data systems to Aurora. It processes 25 TB of real-time data daily, according to its lead architect, Eddie Dingels, as stated in Amazon's announcement. Dingels said his firm was "very impressed with how easy it was to move from our current SQL Server databases to Aurora with only a few changes."
Edward Wong, solutions architect for California utility Pacific Gas & Electric, said his firm "can run many replicas with millisecond latency. This means during a power event we can handle large surges in traffic and still give our customers timely, up-to-date information." PG&E also likes the guarantee that data for customer feedback on a large scale will be there when it's needed.
Content management firm Alfresco was also cited in the announcement. Founder and CTO John Newton said, "[W]e scaled to 1 billion documents with a throughput of 3 million per hour, which is 10 times faster than our MySQL environment" on-premises.
Amazon itself is claiming Aurora will outperform an on-premises MySQL system by a more modest margin of five to one, Wood said.

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive ...