IBM Weighs In: Information Wants To Be Expensive

I was quite disappointed to discover that IBM has cut off free access to historical IBM Journal articles. Decades of valuable, industry-focused computing literature is now behind a "paywall," material that establishes the foundations of BI, data warehousing, text analytics, and more. For what? IBM earned $12.3 billion on sales of $103.6 billion in 2008. Let's look at what we in the data business have lost...

Seth Grimes, Contributor

April 19, 2009

I was quite disappointed to discover that IBM has cut off free access to historical IBM Journal articles. Decades of valuable, industry-focused computing literature is now behind a "paywall." Gone is open availability of seminal material that establishes the foundations of business intelligence, data warehousing, text analytics, and more. For what? IBM earned $12.3 billion on sales of $103.6 billion in 2008. If IBM makes $1 million yearly selling journal access, that would represent less than one-hundredth of one percent of annual profit. Let's look at what we in the data business have lost. Consider three papers.

In an October 1958 IBM Journal article entitled "A Business Intelligence System," Hans Peter Luhn articulated a vision for management of knowledge extracted from business sources. It is BI's ur-document, as I describe in my own "BI at 50 Turns Back to the Future," and it used to be freely available.

Another Luhn article, "The Automatic Creation of Literature Abstracts," published in April 1958, describes exploratory research to automate analysis of text. "Statistical information derived from word frequency and distribution is used by the machine to compute a relative measure of significance, first for individual words and then for sentences. Sentences scoring highest in significance are extracted and printed out to become the 'auto-abstract'."

This article and others from the era lay out foundations for statistical text analytics. (Fortunately you can still find this particular one on-line.)
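To make Luhn's description concrete, here is a minimal Python sketch of frequency-based sentence extraction. It is a simplification and not Luhn's actual procedure: the stop-word list, the tokenizer, and the function names (`auto_abstract`, `split_sentences`, `tokenize`) are assumptions made for illustration, and each sentence is scored simply by the average frequency of its significant words.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption for this sketch, not taken from Luhn's paper).
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "was", "in", "on", "for"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())


def auto_abstract(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)

    # Step 1: measure word significance by frequency across the whole text.
    word_counts = Counter(
        w for s in sentences for w in tokenize(s) if w not in STOP_WORDS
    )

    # Step 2: score each sentence by the average frequency of its significant words.
    def score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        return sum(word_counts[w] for w in words if w not in STOP_WORDS) / len(words)

    # Step 3: pick the top-scoring sentences and restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

Calling `auto_abstract(document_text, max_sentences=3)` would return up to three sentences as a crude "auto-abstract." A fuller treatment would need better tokenization and a more careful definition of which words count as significant.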

And you can find the first written use of the term "data warehouse" (that I know of) in a 1988 IBM Systems Journal article by Barry Devlin and Paul Murphy of IBM Ireland... if you are able and willing to pay for access. Their article, "An Architecture for a Business and Information System," defined a "business data warehouse" as "the single logical storehouse of all the information used to report on the business."

Devlin wrote me last February, in an exchange exploring early data warehousing, "the internal work on defining the data warehouse architecture in IBM began in 1985, and we certainly didn't build on anything [another, later author] wrote." The 1988 paper formalized this work, laying foundations for a multibillion-dollar industry that has delivered untold enterprise business value over the last two decades.

These three articles and many, many more deserve to be free. In an era tending to open source, IBM, highly profitable IBM, is stooping down to pick up loose change from the floor.

IBM is of course not alone in charging for computing content. Unlike the Association for Computing Machinery or the IEEE Computer Society or a publisher, however, IBM is neither a membership association nor a company that subsists on content-derived revenue. It is inconceivable that IBM would ever garner more than a vanishingly small proportion of overall revenues from journal-access payments.

So here we have an affirmation of Stewart Brand's 1984 observation, "On the one hand information wants to be expensive, because it's so valuable. The right information in the right place just changes your life. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time. So you have these two fighting against each other."

And it's a shame.

About the Author(s)

Seth Grimes

Contributor

Seth Grimes is an analytics strategy consultant with Alta Plana and organizes the Sentiment Analysis Symposium. Follow him on Twitter at @sethgrimes
