Greenplum is announcing a long-term vision today, under the name Enterprise Data Cloud (EDC). Key observations around the concept -- mixing mine and Greenplum's together -- include:
In essence, Greenplum is pitching this story:
When put that starkly, it's overstated, not least because Specialized Analytic DBMS != Data Warehouse Appliance.
But basically it makes sense, for two main reasons:
Of course, the EDC vision isn't quite as new or differentiated as Greenplum would ideally like one to believe.
One particular source of potential confusion is Greenplum's emphasis on the buzzphrase self-service (data mart). This seems to be a conflation of two related concepts:
One thing that's needed for this technology to come to full fruition is sophisticated data movement and synchronization. Ideally, some tables in a data mart could be virtual -- views against a central database. But others would be physically recopied from the center, with all the ETL / ELT / ETLT / replication issues that entails. Meanwhile, it's not obvious that the ideal architecture is a simple-minded hub-and-spoke -- perhaps one should be able to spin data marts out of other marts, which might at least somewhat reduce the proliferation of tables and the recopying of data. And it should be easy for administrators to change deployment strategies, e.g., by starting a table out as a view and switching it to a physical copy as usage profiles change.
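To make the view-versus-copy switch concrete, here is a minimal sketch of what "starting a table out as a view and switching it to a physical copy" could look like. It uses Python's built-in sqlite3 module purely as a stand-in; it is not Greenplum's mechanism, and the table names (central_sales, mart_east_sales) and the helper promote_view_to_table are hypothetical names made up for illustration.

```python
import sqlite3


def object_type(conn, name):
    """Return 'view', 'table', or None for a schema object (SQLite catalog lookup)."""
    row = conn.execute(
        "SELECT type FROM sqlite_master WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


def promote_view_to_table(conn, name):
    """Hypothetical helper: replace the view `name` with a physical copy of its rows.

    Identifiers are interpolated naively, which is acceptable for a sketch
    but not for untrusted input.
    """
    tmp = name + "__copy"
    conn.execute(f'CREATE TABLE "{tmp}" AS SELECT * FROM "{name}"')  # snapshot the view
    conn.execute(f'DROP VIEW "{name}"')                              # retire the virtual form
    conn.execute(f'ALTER TABLE "{tmp}" RENAME TO "{name}"')          # same name, now physical
    conn.commit()


# Assumed toy schema: a central database table and one mart defined over it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE central_sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO central_sales VALUES (?, ?)",
    [("east", 10.0), ("west", 20.0), ("east", 5.0)],
)
conn.execute(
    "CREATE VIEW mart_east_sales AS "
    "SELECT region, amount FROM central_sales WHERE region = 'east'"
)

type_before = object_type(conn, "mart_east_sales")
promote_view_to_table(conn, "mart_east_sales")
type_after = object_type(conn, "mart_east_sales")

# New data arrives at the center after the switch.
conn.execute("INSERT INTO central_sales VALUES ('east', 100.0)")
conn.commit()

mart_total = conn.execute("SELECT SUM(amount) FROM mart_east_sales").fetchone()[0]
central_total = conn.execute(
    "SELECT SUM(amount) FROM central_sales WHERE region = 'east'"
).fetchone()[0]
```

Queries against mart_east_sales keep working unchanged after the switch, which is the "easy for administrators" part. But the physical copy no longer sees the row inserted at the center afterward, so mart_total and central_total diverge. That gap is exactly where the ETL / replication machinery has to come in.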
Oliver Ratzesberger of eBay also argues that workload management -- not a current Greenplum strength -- can be crucial. For example, if the CEO wants the CFO to get her an answer TODAY, the fastest approach may be to create an entirely virtual data mart, with very favorable SLAs (Service Level Agreements). More generally, if you're setting up dozens of marts that contain views of the central database, sophisticated SLA management can be essential. There's a big virtualization opportunity here -- but virtualization requires a lot of system management infrastructure.