Cornell University Integrates Data Warehouse for Improved Automation
Founded in 1865, Cornell University is a privately endowed research university and a partner of the State University of New York. Ranked in the top one percent of universities in the world, Cornell is made up of 14 colleges and schools serving roughly 22,000 students. Those students generate a wealth of data, which grows with the number of services they access while enrolled at the university.
To manage this data, schools rely on a number of data-driven programs, including both SaaS and on-premises CRMs and other software. Departments are often limited to the analytics available within each application’s reporting capabilities. For IT teams, providing analytics across systems to get a fuller picture of the university can be extremely challenging.
For Cornell, data integration and visibility across systems are paramount. According to Jeff Christen, IT lead for developing and operating Cornell’s data warehouse, “By breaking down data silos through the integration of multiple subject areas, Cornell gains much greater insight into their enterprise operations.” Cornell, which has been a WhereScape customer for five years, began this data warehouse conversion to reduce the impact on its users of managing large, rapidly expanding data sets of student, contributor and employee records.
The challenge
Cornell University had been using IBM Cognos ETL Data Manager to transform and merge data into an Oracle data warehouse. When IBM announced that support for the product would be discontinued, the university saw an opportunity to find and implement a more flexible solution.
In putting together its search criteria for a replacement ETL (extract, transform and load) tool, the team identified the benefits automation software could offer. The team also wanted an alternative way to format data into dimensional models and perform data transformations, as well as an open, metadata-based solution for analytics and for reporting on data lineage-type extractions. Lastly, the team needed a tool that internal developers could learn quickly and, because it was dealing with a large customer base and multiple departments, it was extremely important to avoid potential outages.
The new system would focus on migrating student data first. One of the largest data sets, it was selected because it affected the largest customer base across the campus. The data warehouse is regularly accessed by nearly 2,000 people across the university in multiple departments, including enrollment, admissions, the bursar’s office and billing, as well as individual colleges.
The solution
During its search, Cornell’s IT team encountered WhereScape, a leader in data infrastructure automation, at a higher education data warehousing conference. The team was introduced to WhereScape® RED, an integrated development environment that automates the development, deployment and operation of data infrastructure on leading data platforms. WhereScape automation helps teams get to production faster, with less cost and risk, regardless of whether the organization’s data infrastructure is on-premises, in the cloud, or a hybrid of both.
The Cornell team quickly saw the potential WhereScape automation offered its organization and, as part of a proof of concept, used WhereScape RED to tackle one of the core data model load processes within its financial data mart. The proof of concept confirmed WhereScape’s ease of use. Additionally, the licensing model was a good fit for the university because pricing was by the seat rather than by CPU, which would have been cost-prohibitive. Because WhereScape met its key search criteria, Cornell selected it as its data warehouse automation solution and as a replacement for Data Manager.
“From day one, our initial impression was that WhereScape was easy to use and would present a very low learning curve for the team,” stated Christen.
Cornell evaluated other tools on the market, but WhereScape proved to be the best fit for Cornell’s unique needs. According to Chris Stewart, WhereScape director of services, “A lot of packaged solutions offered today are really bolt-on analytics tools. Pre-set or canned analytics that are often sold as a ‘one size fits all’ solution. In our experience, higher education organizations often require more customization and analytics to reflect their unique data environments than these tools can offer – and without automation, it’s too hard to create and run the analytics you need.”
The results
In addition to migrating subject area data marts from Data Manager to WhereScape RED, Cornell began using WhereScape automation to develop new data marts.
One of the first projects was to quickly develop and deploy a new subject area for tracking employee position history from Workday, its human resources system. The project resulted in improved performance and provided the team with new access to metadata to aid in future work.
“Performance impacts were huge. The student warehouse previously had a nightly refresh window that took up to 10 hours, causing business operational issues as the morning team needed to access the data. The same refresh now takes 4.5 hours, allowing us to meet service level agreements.”
By selecting WhereScape, Cornell was able to manage the data warehouse migration in smaller sub-projects rather than as one large migration, minimizing the daily impact on its users. Had the team selected another product that required the move in one large migration, more consulting help would have been needed and costs would have escalated.
Having seen the results that WhereScape automation delivers for its own IT team, Cornell now shares the value of data infrastructure automation with its students. “WhereScape automation is now a part of the curriculum in the business intelligence systems course. Within the hands-on education we provide surrounding dimensional data modeling concepts, WhereScape RED is now leveraged for ETL and data visualization, and used within student team projects on campus,” stated Christen. Cornell will continue its rollout of WhereScape automation across campus and looks forward to its next deployment.
About the Author: Neil Barton is the Chief Technology Officer for WhereScape, the leading provider of data infrastructure automation software, where he leads the long-term architecture and technology vision for the company's software products. Barton has held a variety of roles over the past 20 years, including positions at Oracle Australia and Sequent Computer Systems, focused on Software Architecture, Data Warehousing and Business Intelligence. Barton is a co-inventor of three US patents related to Business Intelligence software solutions.
Edited by Maurice Nagle