An Inside Look at Sears' Big Data Investment

Sears Holdings has faced many challenges typical of any large enterprise: difficulty meeting production schedules and service level agreements (SLAs), multiple copies of data with no single version of the truth, ETL (extract, transform, load) complexity and the cost of the software needed to manage it, data latency, enterprise data warehouses unable to handle the load, mainframes running over workload capacity, and escalating costs.

The retailer wanted to build an agile enterprise that would be nimble and quick and would operate at the speed of business. The transformation from legacy and proprietary technologies to a cloud-based, open-source big data platform allows for nimbler systems that support the business at materially lower costs. To get there, Sears worked with several different solutions before it selected Hadoop.

“That’s what triggered us to actually look into a different solution,” says Aashish Chandra, divisional vice president of Sears Holdings Corporation and GM of MetaScale. “And Hadoop wasn’t the first. We weren’t so lucky to come across Hadoop on the first go. We made mistakes on our way and eventually looked into Hadoop, less than three years ago, with a small proof of concept, including just eight nodes. Before we knew it we found it to be the answer to a lot of questions that we were unable to solve. That’s the reason we started to look deeper into Hadoop.”

The retailer credits Hadoop with providing a database that can affordably store data in one place, apply tools to it and let the business consume it easily. For Sears, Hadoop is the new mainframe, and it is treated and governed like one.

Although Hadoop has been used to analyze unstructured data for some time, the retailer has become a leader in applying the platform to structured data in a traditional enterprise, with the aim of reducing ETL complexity, data latency and costs.

Sears not only simplifies and modernizes its code to make it easier to maintain, but also improves performance and significantly reduces costs by moving workload off mainframes and data warehouses. Even the way Sears handles archiving has changed. The retailer was once so inundated with data that it could store only a small portion of it. Now it captures as much data as possible, keeps it forever in the Hadoop Distributed File System (HDFS), and runs queries on it for analysis and reporting.

“The business is moving to a real-time world, wanting information now, so that better and fact-based business decisions could be made with crucial insight,” says Chandra. “To enable the business, we needed to move at the speed of business so that business can stay competitive and for that, batch-processing wasn’t a viable option anymore.”

Date: September 3, 2013

Source: RIS