MarkLogic's CEO on HealthCare.gov and dueling with Oracle

Open enrollment on HealthCare.gov, the federal database through which US citizens can sign up for health care, begins for the fourth year on November 1.

The website’s launch was infamously botched in 2013, and the government recovered by moving part of the massive endeavor from Oracle to MarkLogic, which offers a non-relational, document-based database. MarkLogic’s CEO Gary Bloom says it’s not uncommon for his company’s customers to come to them with a project originally started on an Oracle database.

Still, the drama around health care databases continues: Just a few weeks ago, the state of Oregon announced it was settling its years-long legal battle with Oracle over its health care marketplace. When Oregon gave up on its state-run database back in 2014, it turned to the federal marketplace.

Bloom’s career has intersected with Oracle at multiple points. He was an executive vice president at Oracle from 1986 to 2000. He was also on the board of Taleo, which was later acquired by Oracle. Before coming to MarkLogic, Bloom was CEO of Veritas, acquired by Symantec, and eMeter, acquired by Siemens.

ZDNet caught up with Bloom ahead of open enrollment — and after Oracle CTO Larry Ellison’s fiery takedown of Amazon at Oracle OpenWorld — to discuss the future of databases.

This conversation was edited for brevity:

Is HealthCare.gov a unique case, or does it offer insight into bigger market challenges?

I don’t think what MarkLogic accomplished with Obamacare is any different from what enterprises around the globe are facing right now. They need to build their databases to integrate their silos of data.

For Oregon, the only system Oracle was trying to build was the database. When we talk about integrating all the sources of data, things like IRS, immigration and credit data on citizens, that was all part of the Data Services Hub built by MarkLogic and run by the Department of Health Services. The Department of Health Services ran that as a service for all the state services.

In a funny way, Oregon had a much smaller challenge — which they could not complete — than the feds had. For the federal market, we had to build the federal marketplace, as well as the Data Services Hub.

If I look at that Data Services Hub, it’s an extremely common problem that says, ‘I have lots of data coming from lots of places, and I need a united, 360-degree look at that data.’ … In the health market more broadly, how do you get a 360-degree look at the patient? That includes the hospital systems, the insurance system, the pharmacy, the bill payment systems, the lab systems…

It’s a very common problem and probably one of the most difficult technology problems customers are facing today — the need to integrate data from silos created over the last 50 or 60 years, both mainframe and relational-created silos.

Oregon agreed to settle with Oracle for $25 million in cash and six years of Oracle software…
When you look at Oregon, it’s natural to try to stick with historically dominant vendors. The general view is that’s the path with least resistance, with the least amount of risk. It often turns out that’s the most expensive approach, and it’s certainly the path of least value.

When I look at Oregon and I look at the settlement, they spent several years trying to deliver the system, they hired Oracle to do that work and had to throw the towel into the ring and sign up with the federal system.

This makes it a strange settlement. Essentially, the reason Oregon failed with the health care system is that they used their comfortable, incumbent technology. Now they’re saying as part of the settlement, they can modernize the state government’s IT system. Well, you’re trying to modernize with yesterday’s technologies…

They’re kind of doubling down on the incumbent technology that just failed them. I’m puzzled as to how the citizens of Oregon are going to benefit from this. Having almost unlimited use of the technology for six years seems a little bit strange given the massive scale of the failure of the health care system.

Why is this so challenging for other non-relational databases?
I don’t think it’s necessarily a technical challenge — it’s a challenge of the design point of your product. The design point for MarkLogic was always enterprise class customers. If I’m moving to the new generation from the prior generation, I don’t leave behind my enterprise requirements — that’s availability, performance, security — all the things a typical data center requires if you’re going to run your business on the product. The vast majority of the products in the NoSQL environment were not designed for enterprise class customers, they were designed for the web developer.

If you actually go back in history, when Oracle first came out they didn’t worry about those data center features, either… Oracle added a lot of those enterprise capabilities as an afterthought. We actually built all that capability into MarkLogic.

If you look at something like the federal marketplace for health care, this past year we ran upwards of 280,000 concurrent users, we ran about 5,500 transactions a second, so massive workloads. And then, oh by the way, we have IRS data, immigration data, credit data, personal and private information on residents of the United States — so security obviously is a requirement.

So we’ve always focused on that… In the current version we’re moving forward on, MarkLogic 9, we’re going to take our NoSQL database and make it the most secured database in the world, not just the most secured NoSQL database.

You’ve said that about half your customers come with projects in hand that started with Oracle. How do they end up working with MarkLogic?
There’s been a dominant database in the marketplace for now almost 30 years, being Oracle and the relational database model. What customers do is go to the technology they know… Real thought leaders think about next generation technologies, in a funny way, to solve a problem that relational technology was never designed to solve, and that’s the issue of integrating data from lots of different sources with lots of different structures.

What we’ve seen in our customer base is most customers start out with the incumbent product, whether that be DB2 or Microsoft SQL Server or Oracle (Oracle more frequently than not because of market share). Then after literally sometimes years of frustration, they say there’s a different way to solve this problem.

They’re getting frustrated trying to integrate silos of data to get a 360-degree view of their customers or to integrate data for something like an operational or transactional system… Relational technology was never designed to solve that problem, so people move on.

Why not move to an Oracle NoSQL product?
Oracle does have a product they label NoSQL technology — it’s very different than ours. We offer a document database, they offer a key value database, which is the Berkeley DB acquisition they did.

What happens is they push their incumbent product where they have dominant market share. They tend to not push their NoSQL product — it really isn’t competitive in the marketplace. If they convince a customer that relational isn’t the product to solve it, customers will move on to other technologies other than Oracle’s product… It’s kind of a sideshow for Oracle, and for us it’s our primary business.

Larry Ellison slammed Amazon on several points at Oracle OpenWorld, one of them being vendor lock-in. What did you make of his attacks?
When you talk about vendor lock-in, we’re no different than Oracle — we have customers running on Amazon, on Azure, on the Google Compute Platform, and obviously we have a lot of customers on their private cloud or on premise.

I think where the Oracle attack was a little bit misguided is its focus on platform-as-a-service. You’re going there to get compute capacity, and it actually is relatively easy for customers to move from one compute platform to another. It’s very hard if you have to start rewriting all your applications. So the vendor lock-in actually happens at the application layer…

When I think about the focus on Amazon, I think about it more like what happened 30 years ago with IBM. When Oracle started moving into the database market, IBM said we’re going to provide end-to-end services, which led to the creation of IBM Global Services and kind of a redefinition of essentially what IBM was… They decided, ‘We’re going to be a leader in the services business.’ That distracted everybody from what was happening to IBM’s core business 30 years ago. New technologies and new platforms were starting to erode IBM’s mainframe dominance. The exact same thing is happening today.

What Oracle’s starting to do now… they’re saying we’re a cloud provider. They’re creating a very similar distraction layer, saying, ‘I’m going to take Wall Street’s focus and move it from the database to the cloud and say, “Measure me on nothing but the cloud.”’ In reality, Oracle’s core business is starting to be eroded by next generation technologies.

Larry’s focus, Mark Hurd’s focus, it’s really an applications level argument, it’s not necessarily tied to what’s going on in their database business.

One interesting thing we heard about at OpenWorld is Oracle’s turn to AI. What tools would you say are needed to meet today’s big data challenges?
At the foundation of artificial intelligence, machine learning and all these new tricks that everyone wants to apply against their data, at the heart of it is, I have to bring the data together. If I look over at Watson from IBM, if the data can’t get into Watson and I can’t load my data, Watson can’t be very intelligent about the data.

As Oracle starts focusing on AI technology, one of the requirements is I have to be able to integrate my data into a repository under which something smarter in the logic can start to figure out the meaning. This is exactly at the heart of the difference between Oracle and MarkLogic. Oracle essentially created data silos because of the rigidity of their technology — every time you wanted to do something with the data, you would create a new copy of the data for the different application.

We take all those data silos and integrate them into a common repository, and AI becomes just one of the asks, no different than some people wanting to run transactions. Some people want to do real time analytics, some want to do search and discovery, so the idea is you need a flexible data platform where you can do all these different workloads from a common repository, exactly the opposite of what happens in a relational database. In a relational database you keep making more copies of the data. Still today it suffers from its inability to effectively process unstructured data. At the end of the day, you’re only working with a subset of your data that fits nicely into Oracle row and column infrastructure.

Date: Sep 26, 2016

Source: ZDNet