After long delays and multiple false starts, the IRS delivers a new database to speed up returns processing and delivery of refunds, and to improve fraud detection.

J. Nicholas Hoover, Senior Editor, InformationWeek Government

April 18, 2012

Did you notice anything different about the way your tax return was handled this year by the IRS? Using a new database management system, the agency is processing most returns daily rather than once a week, as in the past. IRS Commissioner Doug Shulman says the new system enables faster processing of returns and better customer service, while cutting down on fraud.

The system, called Customer Account Database Engine 2 (CADE 2), was almost 25 years in the making. Shulman, speaking to InformationWeek on April 17, deadline day for 2011 tax returns, called CADE 2's deployment a milestone for the IRS, which had come under sharp criticism for previous efforts to modernize its tax-processing systems.

"What CADE 2 does is allow the data to get processed more quickly," Shulman said, adding that the greatest benefit for taxpayers will be faster refunds. "There are a lot of taxpayers struggling. People depend on the refunds as a major cash infusion to help with food, healthcare, and housing."

The agency moved to daily processing of tax returns in January after more than five decades of weekly batch processing. The old tape-based system, the 1960s-era Individual Master File, was expensive to maintain and made it difficult to answer taxpayer questions about the status of tax returns because of the lag between when documents were received and when they were processed. The mainframe-based Individual Master File and its related processes have been obsolete for decades. Plans to migrate to a relational database management system go back to 1988.

"The code was well written, but it was sequential processing, assembly language code. It had data from the '60s until the current time, and there were all these limitations where we had to build other legacy systems around the defects," IRS CTO Terry Millholland said in an interview. "We had a ton of legacy environments with very complex processing."

CADE 2 runs IBM database software and uses Informatica tools for data extraction and transformation. The system's promised benefits include faster processing, the ability to use analytics for customer service and fraud prevention, lower maintenance costs, and improved security. The performance targets for CADE 2 include processing 90% of transactions within two days and reducing the data error rate by 5% below the 2011 baseline.
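As a rough illustration of how the first of those targets could be checked, here is a minimal Python sketch. The sample processing times and the helper name `share_within` are hypothetical and are not drawn from IRS data or systems; only the 90% and two-day figures come from the target described above.

```python
def share_within(days_to_process, limit_days=2):
    """Return the fraction of transactions processed within limit_days."""
    if not days_to_process:
        raise ValueError("no transactions to measure")
    within = sum(1 for days in days_to_process if days <= limit_days)
    return within / len(days_to_process)


# Hypothetical sample: days taken to process each of ten transactions.
sample = [1, 1, 2, 2, 1, 3, 1, 2, 1, 5]

share = share_within(sample)
meets_target = share >= 0.90  # target: 90% of transactions within two days
```

In this made-up sample, eight of the ten transactions finish within two days, so the sample would fall short of the 90% target.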

More refunds are getting to taxpayers in a timely fashion this tax-filing season, according to the IRS, which reports that about 60% of taxpayers who filed electronically have received refunds in eight days or less, up from 30% last year.

CADE 2 almost didn't happen. Following schedule delays and budget overruns on earlier versions of the system, the agency scaled back its plans and extended its timeline for delivery, with plans stretching into the 2020s. As a result, CADE 2 deals only with individual taxes, not business taxes or those related to retirement plans. But the project completion date was accelerated, and daily processing went live in January, in time for this year's tax season.

The agency began moving the processing of simple returns like the 1040EZ form to CADE 2's predecessor system, a relational database developed with help from CSC. When Shulman took on the commissioner job in 2007, he made it a priority to push the project through to completion. "We took the IT portfolio and shut down some other projects," he said. "We put the A-team on it."

In November 2008, the IRS hired CTO Terry Millholland, a former tech exec with Visa, Boeing, and EDS, to oversee its tax systems, including the CADE project. The agency created a new governance plan for the program, headed by an associate CIO and overseen by multiple boards.

The IRS functioned as systems integrator for CADE 2, rather than contracting out that job. Millholland's strategy has been to "get the data right, and the functionality will follow."

That meant creating a data model for 30,000 data elements, then extracting all the structured and unstructured data from the Individual Master File, converting it, and loading it into CADE 2. And data integrity had to be such that "it balances to the penny," said Millholland.
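To make the "balances to the penny" requirement concrete, here is a minimal Python sketch of a reconciliation check after an extract-convert-load step. It is not the IRS's method: the record layout, account IDs, and function names are all hypothetical, chosen only to illustrate the idea of comparing record counts and exact totals between a legacy source and a migrated target.

```python
from decimal import Decimal

# Hypothetical legacy records: (account_id, balance stored as fixed-width text).
legacy_records = [
    ("A001", "0000123.45"),
    ("A002", "0000010.10"),
    ("A003", "-000005.55"),
]


def convert(record):
    """Convert one legacy record into a row for the new (hypothetical) schema."""
    account_id, raw_balance = record
    # Decimal keeps exact cents; binary floats could drift by fractions of a penny.
    return {"account_id": account_id, "balance": Decimal(raw_balance)}


def balances_to_the_penny(source, target):
    """True only if record counts match and the totals are exactly equal."""
    source_total = sum((Decimal(raw) for _, raw in source), Decimal("0"))
    target_total = sum((row["balance"] for row in target), Decimal("0"))
    return len(source) == len(target) and source_total == target_total


migrated = [convert(record) for record in legacy_records]
ok = balances_to_the_penny(legacy_records, migrated)
```

The sketch uses `Decimal` rather than floating-point numbers because an exact comparison of currency totals is the whole point of the check; a real migration would presumably reconcile at many more levels than a single grand total.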

In its budget request for fiscal year 2013, the IRS cited phase two of CADE 2 development as among its planned areas of investment. The agency is looking to retire the IMF and rewrite large chunks of machine code in Java. And it plans increased use of analytics tools with CADE 2 to support its compliance efforts. It already applies filters to scour tax returns for inaccuracies or fraud.

IRS watchdogs are keeping an eye on the project. Last September, the IRS agreed to take steps to improve the project's management, after the inspector general complained that the agency wasn't consistently implementing system development practices, that too many risks were undocumented, and that test plans were insufficiently developed.

The agency faces other tech challenges. Increased call volumes have led to a 48% increase in call wait times since 2008, a problem that will likely require new approaches to customer service, self-service, and automation. And the IRS is still tweaking its electronic filing systems, which have experienced data transmission problems at times.

Attend InformationWeek's IT Government Leadership Forum, a day-long venue where senior IT leaders in government come together to discuss how they're using technology to drive change in federal departments and agencies. It happens in Washington, D.C., May 3.
