Carfax Selects MongoDB To Drive 11 Billion Records

Vehicle-history service switches to open source, NoSQL database with an eye to exploring its massive data set in new ways.

Ellis Booker, Technology Journalist

March 8, 2013


There's a 30-year-old relational database up on blocks at Carfax's Columbia, Mo., office.

On Tuesday, the Web service, which supplies used-vehicle history reports to millions of consumers and 30,000 dealerships every year, announced plans to retire its VMS-based RDBMS and switch to MongoDB, the open source, document-oriented database developed and supported by 10gen.

"VMS has been a very valuable OS for us," Carfax CTO Joedy Lenz told InformationWeek in a phone interview. "Unfortunately, with our data volumes, it became fairly expensive to operate and maintain." The production VMS system will be retired within 12 months, he said.

Carfax's Vehicle History Report, created in 1986, is the largest vehicle-history database ever assembled, with nearly 11.5 billion records and growing at 1 billion new records a year. It comprises information from more than 75,000 sources, such as U.S. and Canadian motor vehicle departments, service and repair facilities, insurance companies, and police departments.


When it takes over the driver's seat, MongoDB will run across 50 servers. Lenz declined to name the hardware vendor, but 10gen CEO Max Schireson told InformationWeek by phone that "using inexpensive commodity servers means they can scale out."

Although MongoDB is an open source product, 10gen claims some 500 customers worldwide who pay for its consulting and services. The customer list includes marquee Web brands like eBay and Craigslist as well as traditional businesses, including three of the top 10 global banks and telcos, among others.

Another advantage of using MongoDB is its built-in redundancy. If a node fails, work is picked up by one or more secondary nodes.
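
The failover idea can be illustrated with a toy model. This is a minimal sketch, not MongoDB's actual election protocol: the class `ToyReplicaSet`, its methods, and the node names are all hypothetical, and the promotion rule (pick the first healthy node) is a simplifying assumption.

```python
class ToyReplicaSet:
    """Toy model of primary/secondary failover (hypothetical, not MongoDB code)."""

    def __init__(self, names):
        if not names:
            raise ValueError("a replica set needs at least one node")
        # Track the health of every node; the first node starts as primary.
        self.healthy = {name: True for name in names}
        self.primary = names[0]

    def fail(self, name):
        """Mark a node as failed and promote a secondary if it was the primary."""
        self.healthy[name] = False
        if name == self.primary:
            self.primary = self._promote()

    def _promote(self):
        # Simplifying assumption: the first healthy node in insertion order wins.
        for name, ok in self.healthy.items():
            if ok:
                return name
        return None

    def write(self, doc):
        """Return the name of the node that would accept the write."""
        if self.primary is None:
            raise RuntimeError("no healthy node available")
        return self.primary


rs = ToyReplicaSet(["node-a", "node-b", "node-c"])
before = rs.write({"vin": "VIN-0001"})
rs.fail("node-a")
after = rs.write({"vin": "VIN-0001"})
```

In this sketch, writes go to `node-a` until it fails, after which `node-b` takes over without the caller changing anything.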

In fact, Carfax already uses a seven-node VMS system. In early performance testing, however, MongoDB ran transactions up to four times faster, Lenz said. But speed and cost savings weren't the only reasons Carfax decided to migrate to a NoSQL architecture.

Unlike their relational predecessors, NoSQL databases like MongoDB, Cassandra and Riak use a flexible, schema-less design that is especially well suited for massive amounts of variable data.
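
To make "schema-less" concrete, here is a minimal sketch using plain Python dictionaries as stand-ins for documents. The field names, VINs, and values are invented for illustration and are not Carfax's data model; no database driver is involved.

```python
from collections import defaultdict

# Hypothetical vehicle-history events. Each source reports different fields,
# and no shared schema forces them into the same columns.
records = [
    {"vin": "VIN-0001", "source": "dmv", "event": "title_issued", "state": "MO"},
    {"vin": "VIN-0001", "source": "repair_shop", "event": "service",
     "odometer_miles": 42150, "work": ["oil change", "brake pads"]},
    {"vin": "VIN-0002", "source": "insurer", "event": "claim",
     "damage": {"area": "rear", "severity": "minor"}},
]


def history_by_vin(docs):
    """Group documents by VIN without requiring a common set of fields."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc["vin"]].append(doc)
    return dict(grouped)


def fields_seen(docs):
    """Return the sorted union of field names across all documents."""
    return sorted({key for doc in docs for key in doc})


history = history_by_vin(records)
all_fields = fields_seen(records)
```

A relational table would need a column (or a side table) for every field in `all_fields`, most of them empty for any given row. In a document model, each record carries only the fields its source supplied.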

"Mongo does [transaction processing] with the added benefit of analytics and data mining," Lenz said. "The sky's the limit ... we're just scratching [the] surface."

As NoSQL products like MongoDB win new adherents, relational database vendors haven't been sitting still. Just last month, Oracle announced a major upgrade, MySQL 5.6, which includes features for high-scale deployments. For example, Oracle announced it would support direct access to data through the Memcached API, which is up to nine times faster than accessing data through SQL parsing.

About the Author

Ellis Booker has held senior editorial posts at a number of A-list IT publications, including UBM's InternetWeek, Mecklermedia's Web Week, and IDG's Computerworld. At Computerworld, he led Internet and electronic commerce coverage in the early days of the web and was responsible for creating its weekly Internet Page. Most recently, he was editor-in-chief of Crain Communication Inc.’s BtoB, the only magazine devoted to covering the intersection of business strategy and business marketing. He ran BtoB, as well as its sister title Media Business, for a decade. He is based in Evanston, Ill.
