Kinetica Uses GPUs for Advanced In-Database Analytics
GPU-accelerated databases aren't new, and they're not all that popular, but that will likely change over time. Here are some of Kinetica's latest improvements and why they matter to big pharma company GSK.
Kinetica, the in-memory, GPU-accelerated database, now enables in-database analytics via user-defined functions (UDFs). The capability makes the parallel processing power of graphics processing units accessible to custom analytics functions deployed within Kinetica. With the enhancements, machine learning and AI libraries such as TensorFlow, BIDMach, Caffe, and Torch can run in-database alongside, and converged with, BI workloads.
Business analysts and developers can use Kinetica's APIs to call third-party code, use GPUs without having to move data, and run and manage their own custom libraries. They can also use native API bindings in C/C++ and Java, or use arbitrary binaries to receive table data, perform arbitrary computations on it, and save the output to a global table in a distributed manner.
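The receive-compute-save pattern described above can be illustrated with a minimal Python sketch. It does not use Kinetica's API: the function names (`split_into_shards`, `score_udf`, `run_distributed`) and the table layout are hypothetical, and a thread pool stands in for the database's worker nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(rows, num_shards):
    """Deal rows round-robin into num_shards lists, one per worker."""
    return [rows[i::num_shards] for i in range(num_shards)]


def score_udf(shard):
    """Hypothetical user-defined function; it sees only its own shard."""
    return [{"id": r["id"], "score": r["x"] * r["y"]} for r in shard]


def run_distributed(rows, udf, num_shards=4):
    """Apply udf to every shard in parallel and merge into one output table."""
    shards = split_into_shards(rows, num_shards)
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        partial_outputs = list(pool.map(udf, shards))
    # Flatten the per-shard results into a single "global" output table.
    output_table = [row for part in partial_outputs for row in part]
    return sorted(output_table, key=lambda r: r["id"])


# Toy input table (assumed layout, for illustration only).
table = [{"id": i, "x": i, "y": 2.0} for i in range(10)]
result = run_distributed(table, score_udf)
```

The point of the design is that the function travels to the data: each worker processes the rows it already holds, and only the results are gathered.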
{image 1}
Kinetica sits between the system of record such as an Oracle or SQL database and applications, storing tens and even hundreds of terabytes in-memory and on disk to deliver faster answers to pressing questions.
Kinetica now also enables real-time data discovery with its new "Reveal" data exploration framework. With it, business analysts can visualize and interact with billions of data elements instantly without knowing SQL. It uses a RESTful endpoint as the native connection, which makes it easier to get data in and out.
"It's not just ODBC and JDBC only like most databases. You have a host of APIs that help reduce the middle layer," said Eric Mizell, VP, Global Solution Engineering at Kinetica, in an interview, where he discussed the company's introduction today of an integrated BI/AI/ML platform.
VRAM Boost Mode is another enhancement. With it, users can prioritize their data tables and force datasets to remain in very fast, cluster-wide GPU video RAM (VRAM) for faster query performance.
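As a rough illustration of priority-based placement, the sketch below greedily pins the highest-priority tables into a fixed VRAM budget and leaves the rest in system RAM. This is an assumption made for illustration, not Kinetica's actual algorithm or API; the function `place_tables`, the table names, sizes, and priorities are all hypothetical.

```python
def place_tables(tables, vram_capacity_gb):
    """Greedy placement: highest priority first, into VRAM while it fits."""
    placement = {}
    free_gb = vram_capacity_gb
    for t in sorted(tables, key=lambda t: t["priority"], reverse=True):
        if t["size_gb"] <= free_gb:
            placement[t["name"]] = "vram"
            free_gb -= t["size_gb"]
        else:
            # Does not fit in the remaining VRAM budget; stays in system RAM.
            placement[t["name"]] = "ram"
    return placement


# Hypothetical tables; a higher priority number means "keep this one fast".
tables = [
    {"name": "trades", "size_gb": 40, "priority": 10},
    {"name": "logs", "size_gb": 60, "priority": 1},
    {"name": "quotes", "size_gb": 30, "priority": 5},
]
placement = place_tables(tables, vram_capacity_gb=80)
```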
GSK Uses Kinetica for Simulation
GlaxoSmithKline (GSK) uses Kinetica to accelerate simulations of chemical reactions.
"What you want to do is distribute that over a large number of nodes and let each node [process] its piece of the simulation. By doing the subdivision, you can do simulation faster," said Mark Ramsey, chief data officer at GSK, who invented and patented a similar idea for IBM around 1998.
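The subdivision idea Ramsey describes can be sketched in a few lines of Python. This is a toy model, not GSK's simulation code: it assumes a simple stochastic reaction in which each molecule reacts with a fixed probability per time step, and the names `simulate_chunk` and `simulate_distributed` are hypothetical. A thread pool stands in for the cluster nodes, so the sketch shows how the work is split rather than any real speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(job):
    """Simulate one node's share; return how many molecules have not reacted."""
    n_molecules, p_react, n_steps, seed = job
    rng = random.Random(seed)  # per-chunk seed keeps runs reproducible
    remaining = n_molecules
    for _ in range(n_steps):
        reacted = sum(1 for _ in range(remaining) if rng.random() < p_react)
        remaining -= reacted
    return remaining


def simulate_distributed(total, p_react, n_steps, n_nodes):
    """Split the molecules across n_nodes workers and simulate each piece."""
    base, extra = divmod(total, n_nodes)
    sizes = [base + (1 if i < extra else 0) for i in range(n_nodes)]
    jobs = [(size, p_react, n_steps, seed) for seed, size in enumerate(sizes)]
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        survivors = list(pool.map(simulate_chunk, jobs))
    return sizes, survivors


sizes, survivors = simulate_distributed(10_000, p_react=0.1, n_steps=5, n_nodes=4)
total_remaining = sum(survivors)  # combine the per-node partial results
```

Because the molecules in this toy model react independently, each piece can be simulated without talking to the others, and the partial counts are simply summed at the end.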
GSK is also importing learnings from the Open Targets knowledgebase, created by the European Bioinformatics Institute (EBI) and the Sanger Institute, and combining them with its own historical data on the experiments it has run against targets, its clinical trials, and how all of that relates.
"The goal is for us to be able to select a potential target but have a much deeper knowledge base from a variety of dimensions around that target," said Ramsey. "We have a much better ability to select ideal targets."
Similarly, GSK has a relationship with the National Cancer Institute and the U.S. Department of Energy. It will eventually operationalize the learnings from that undertaking as well.
"One of the things I like about Kinetica is it gives us more of a general-purpose use of the technology," said Ramsey. "There has been a lot of software created to answer certain questions [but] highly specialized tools have limited functionality and are tuned to do a certain workload."
To that point, one of the things Kinetica can do is make its GPU database look like a relational database so users can interact with it using a traditional language such as SQL. That way, GSK can run an analysis in a traditional relational environment and then in Kinetica if the workload requires a more computationally intensive environment, without having to do a lot of tooling and rework.
Do YOU Trust GPU Databases?
GPUs and acceleration are synonymous, but not everyone is sold on GPU-accelerated databases yet. What's your take and what would you look for in a GPU-based in-memory database?