Session #10: Is Big Data a Big Deal or Not?

Pre-Session Poll Question
Is big data a big deal for healthcare?
Most definitely. Big data will revolutionize the way we look at and practice healthcare.
Yes, but it's overhyped and at least 3 years away.
I haven't decided yet.
Not really; humans are too complicated to fully understand with data.
Definitely not. Like EHRs, it's just the next big thing with no positive impact.

Dale Sanders, EVP, Product Development | Health Catalyst
Richard Proctor, GM Healthcare | Hortonworks
#HASummit14
You are in the technology-enabled services business: you leverage technology to deliver the highest-value professional health management services.
Your business now runs at the speed of software
The Adoption Of EHRs
Meaningful Use
Patient Engagement
Care Management
The Empowered Patient

My Parents:
Relationship focused (values stable doctor relationships)
Detests/suspicious of gadgets
Serial processor (email, basic model phone)
Loyalty to brand
Health as a service
Access to care
Patient (willing to wait)
Believes experts (not comfortable seeking second opinions)

My Children:
Goes for best of breed (network of networks)
Super connected
Parallel processor-integrated (state-of-the-art)
Loyalty to value
Take care of health
Health as a right
Impatient
Asks for data (researches through multiple self-enabled searches, demands second opinions)
Providers
Healthcare providers are the only entity currently capable of changing the system
Change Comes From The Margins
"You signal your determination to belong to the center by unreserved conformation to its standards. For all these elaborate protocols are also meant to ensure that not everybody gets in and that enough are left out; this way the center makes itself perpetually desirable. Since most do what they can to keep their place within the status quo, far-reaching innovation is usually discouraged, and conformism rules."
Professor Costica Bradatan, Texas Tech University
Digitization of Personalized, Precision Health
It's just beginning

The Ecosystem of Human Health Data
Healthcare Encounter Data
7x24 Biometric Data
Outcomes Data
Consumer Data
Genomic and Familial Data
Social Data
Late Binding in Data Engineering
Real Life
Data Is Also Alive

Relational vs. Hadoop Analytic Technology
"Technology Life Cycle". Licensed under CC BY-SA 3.0 via Wikimedia Commons: https://commons.wikimedia.org/wiki/File:Tecnology_Life_Cycle.png
[Figure: The Technology Lifecycle Path. Axes: Time vs. Business Gain. Curve labels: R&D, Vital Life, Maturity, ADLM. Plotted on the curve: Hadoop; Relational databases for analytics.]
Big Data vs. Throughput
Volume, Velocity, and Variety of Data

What is so compelling about Hadoop in comparison to relational database technology?
Key-value data vs. relational data: XML, text
Declarative vs. procedural programming
Complex analytic flows with multiple inputs and outputs
YARN, Hive
Optimization for relational database engines is a black art; not so with Hadoop
Blazing fast performance allows for complex algorithmic/machine learning/modeling workloads, including real-time analytics
Granular security
Scale out, not up. A 4x server is way more expensive than adding 4 PCs to a cluster
Licensing savings
Structured vs. unstructured data: don't wait until unstructured data is knocking on your door. Prepare now.
90% of all information created by humans originated in the last 2 years
We are producing 3 exabytes of data per day (1 exabyte = 1 million terabytes)

The 5 Key Pillars of HDP
HDP 2.1: Hortonworks Data Platform
GOVERNANCE & INTEGRATION: Data Workflow, Lifecycle, & Governance
BATCH, INTERACTIVE, & REAL-TIME DATA ACCESS: Script, SQL, NoSQL, Stream, Search, In-Memory, Others, all on YARN: Data Operating System
DATA MANAGEMENT: HDFS (Hadoop Distributed File System)
SECURITY: Authentication, Authorization, Accounting, Data Protection
OPERATIONS: Provision, Manage, & Monitor; Scheduling
Deployment Choice: Linux, Windows, On-Premise, Cloud

The Hortonworks Data Platform therefore addresses all of these capabilities completely in open source.
YARN is the architectural center of HDP and Hadoop. It not only enables multiple data access engines across batch, interactive, and real-time to work on a single set of data, but also extends Hadoop to integrate with the existing systems and tools you already have in your data center. HDP delivers on the key enterprise requirements of governance, security, and operations. And of course it is supported on the widest possible range of deployment options: from Linux, to Windows (the only Hadoop offering on Windows), appliance (from Microsoft or Teradata), or cloud (Microsoft, Rackspace, and more).
HDP is a comprehensive data management platform with one goal in mind: to enable an enterprise architecture with Hadoop.
MOVED TO NOTES FROM THE SLIDE:
The widest range of deployment options
Linux & Windows
On premise & cloud
Only Hadoop distribution for Windows Azure & Windows Server
Native integration with SQL Server, Excel, and System Center
Extends Hadoop to the .NET community
Hadoop Ecosystem
This slide was outdated the moment I borrowed it. See https://hadoopecosystemtable.github.io/ for the latest.
Thank you for the graphic, Aryan Nava
It's the community of Hadoop that makes the difference.
It's not the technology that matters. It's how you leverage the technology.
Hadoop is like an operating system for data management.
It grew up in the era of data and leapfrogged Oracle, Microsoft, and IBM.

Gartner Survey 2015
Hadoop Investment Plans
Obstacles to Hadoop adoption
Commodity Compute and Storage
TrueCar: Storage costs/compute costs from $19/GB to $0.23/GB
UC Irvine: Storage costs and licensing reduction of latent systems: $500,000
ZirMed: 5 times the amount of usable storage and processing power for about 30% of the cost of traditional enterprise technologies

Hadoop was designed to run on commodity hardware so that build-out and scale could be accomplished cost-effectively. Over the past few years, the cost of storage has plummeted, making Hadoop, with its linearly scaling storage and compute, an ideal place to store a lot of data at low cost. Many use HDP for exactly this. Simply stated, HDP allows you to store more data for longer and at less cost.
For instance, Hortonworks customer TrueCar recently analyzed their before/after costs of storing a GB of data with Hadoop and HDP, and the cost dropped from $19/GB all the way down to $0.23/GB. This is a dramatic cost savings that not only funded the project but also had an immediate effect on their bottom line as they marched to IPO. It increased the valuation of the company!
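To put the quoted figures in proportion, the arithmetic can be sketched as follows. This is only an illustration: the function names are hypothetical, and the only inputs are the TrueCar and ZirMed numbers stated on the slide.

```python
def cost_reduction(before_per_gb, after_per_gb):
    """Return (fold reduction, percent saved) for a per-GB storage cost change."""
    fold = before_per_gb / after_per_gb
    percent_saved = (before_per_gb - after_per_gb) / before_per_gb * 100
    return fold, percent_saved


def relative_unit_cost(capacity_multiple, cost_fraction):
    """Cost per unit of capacity relative to the old platform (1.0 = unchanged)."""
    return cost_fraction / capacity_multiple


# TrueCar figures quoted on the slide: $19/GB before, $0.23/GB after.
fold, saved = cost_reduction(19.00, 0.23)
print(f"TrueCar: {fold:.1f}x cheaper per GB, {saved:.1f}% saved")

# ZirMed figures quoted on the slide: 5x the capacity at about 30% of the cost.
unit = relative_unit_cost(5, 0.30)
print(f"ZirMed: each unit of capacity costs {unit:.0%} of the old price")
```

On these inputs, the TrueCar change works out to roughly an 83-fold reduction per GB, and the ZirMed figures imply each unit of capacity costs about 6% of what it did before.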
EDW Optimization

Current Reality: Analytics 20%, Operations 50%, ETL Process 30%
EDW at low capacity: some usage from low-value workloads
Older data archived, unavailable for ongoing exploration
Source data often discarded

Augmented w/ Hadoop: Operations 50%, Analytics 50%
Free up EDW resources from low-value tasks
Keep 100% of source data and historical data for ongoing exploration
Mine data for value after loading it because of schema-on-read
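Schema-on-read, as mentioned on this slide, means the raw data is stored untouched and a schema is applied only when someone queries it. A minimal sketch in plain Python follows; the records, field names, and `read_with_schema` helper are all hypothetical and stand in for what an engine such as Hive does over files in HDFS.

```python
import json

# Raw records are stored exactly as they arrived; no schema is enforced on write.
RAW_LINES = [
    '{"patient_id": "p1", "heart_rate": "72", "note": "stable"}',
    '{"patient_id": "p2", "heart_rate": "95"}',
    '{"patient_id": "p3", "device": "patch-7", "heart_rate": "n/a"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: project the named fields and cast them.

    Fields that are missing or fail to cast become None instead of
    rejecting the record, so the raw data stays fully available.
    """
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            try:
                row[field] = cast(record[field])
            except (KeyError, ValueError, TypeError):
                row[field] = None
        yield row


# Two different questions, two different schemas, same stored data.
vitals_rows = list(read_with_schema(RAW_LINES, {"patient_id": str, "heart_rate": int}))
device_rows = list(read_with_schema(RAW_LINES, {"patient_id": str, "device": str}))
```

Because nothing is discarded at load time, a field nobody planned for (here, `device`) can still be queried later simply by reading with a new schema.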
Near Real-Time Analytics with Epic
Mercy Health
Applications: Business Analytics, Custom Applications, Packaged Applications
Initial bulk load via Sqoop from Oracle/Clarity EDW to HDP.
They then capture deltas every 3-5 minutes from Caché into HDP.
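The load pattern described here, one bulk copy followed by frequent delta captures, can be sketched with a simple high-water mark. This is an illustration only, not Mercy's implementation: the row layout, timestamps, and function names are assumptions, and plain Python dictionaries stand in for the source system and for HDP.

```python
from datetime import datetime


def bulk_load(source_rows):
    """Initial load: copy every source row into the target, keyed by id."""
    return {row["id"]: dict(row) for row in source_rows}


def capture_deltas(source_rows, target, watermark):
    """Upsert rows modified after `watermark`; return the new watermark."""
    newest = watermark
    for row in source_rows:
        if row["modified"] > watermark:
            target[row["id"]] = dict(row)
            newest = max(newest, row["modified"])
    return newest


# Hypothetical source rows with a last-modified timestamp.
source = [
    {"id": 1, "status": "admitted", "modified": datetime(2015, 9, 1, 8, 0)},
    {"id": 2, "status": "admitted", "modified": datetime(2015, 9, 1, 8, 2)},
]
target = bulk_load(source)
watermark = max(row["modified"] for row in source)

# Later changes in the source system, picked up by the next polling cycle.
source[0] = {"id": 1, "status": "discharged", "modified": datetime(2015, 9, 1, 8, 4)}
source.append({"id": 3, "status": "admitted", "modified": datetime(2015, 9, 1, 8, 5)})
watermark = capture_deltas(source, target, watermark)
```

Running `capture_deltas` on a schedule (every 3-5 minutes in the case described above) keeps the copy close to current without repeating the full bulk load.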
[Diagram: Epic and Oracle Clarity feed HDP via Sqoop and a near-real-time path. HDP layers: Applications, Data Access, Data Management, Governance & Integration, Security, Operations.]

TALK TRACK
Mercy runs 35 hospitals and 500 clinics serving 1 million patients annually in Missouri, Oklahoma, Arkansas, and Kansas.
They analyze data to improve their operations, for billing and insurance reimbursements, and most importantly to improve how they care for their patients.
They wanted to create a single view of their patients across those three processes.
Most of Mercy's data was in Epic, their system for electronic medical records, but important data existed elsewhere, outside of Epic.
SUPPORTING DETAIL
Midwest healthcare system with 35 hospitals and 500 clinics serving 1 million patients annually in Missouri, Oklahoma, Arkansas, and Kansas | Source: Video on www.hortonworks.com/customers -- https://youtu.be/iQ2V3FuqxHg
Mercy analyzes data from three different perspectives: operations, finances, and medical outcomes | Source: Video on www.hortonworks.com/customers -- https://youtu.be/iQ2V3FuqxHg
Existing data architecture was hard to scale, and its data schema limited the types of data that Mercy could ingest to enrich Epic data | Source: http://www.healthcare-informatics.com/article/moving-data-down-i-44-and-making-it-actionable | Quote: "When you start to drill down and ask, 'Who are the patients that don't match these criteria?' you find the analysis gets more complicated and everything starts bogging down," [Paul] Boal says.
TALK TRACK
Mercy pursued many use cases on their journey to Hadoop.
On the clinical side, they replicated their Epic system in Hadoop and then enriched that data with other sources for a single view of the patient.
Mercy is working towards being able to present this enriched data to clinical co-workers inside of Epic, while they are at the bedside.
SUPPORTING DETAIL
25 terabytes of Epic data replicated in HDP for deeper analysis | Source: http://www.healthcare-informatics.com/article/moving-data-down-i-44-and-making-it-actionable
Mercy enriches its Epic data with other 3rd-party data | Source: Video on www.hortonworks.com/customers -- https://youtu.be/iQ2V3FuqxHg
Mercy is working towards being able to present this enriched data to clinical co-workers inside of Epic, while they are at the bedside | Source: Video on www.hortonworks.com/customers -- https://youtu.be/iQ2V3FuqxHg
TALK TRACK
Having all that enriched data in HDP has had many other benefits for doctors and patients.
Claims reimbursements are more efficient. Claims coders can now verify the patient's chart while he or she is still in the hospital.
Researchers have access to lab notes that they could never see in the aggregate.
In one example, researchers ran a query on 19,000 individuals. It took 2 weeks on the prior platform, but only half a day in Hadoop.
Mercy hopes to leverage vital sign data from medical devices for predictive models and preventive care.
SUPPORTING DETAIL
Reimbursement claims coders now verify patients' charts while they are still in the hospital | Source: http://www.healthcare-informatics.com/article/moving-data-down-i-44-and-making-it-actionable | Quote: The main purpose for using the open-sourced framework, for now, is to improve medical documentation. Boal says that Mercy is aiming to get documentation up to speed by the time the patient is discharged. This not only helps the providers with actionable data but coders, who can get the right information in the chart and create a more efficient reimbursement process. "[The coders] have better information up front about the patients that are in the hospital right now and which patients they should be focusing on when they do their chart reviews to review with physicians during huddle times," he says.
Mercy labs now searches through terabytes of free-text lab notes, cutting researchers' time to insight from "never" to seconds | Source: Video on www.hortonworks.com/customers -- https://youtu.be/iQ2V3FuqxHg
Researchers estimated that one query on 19,000 individuals would take two weeks on the pre-existing platform, but it ran in half a day on HDP | Source: Video on www.hortonworks.com/customers -- https://youtu.be/iQ2V3FuqxHg
Mercy hopes to leverage vital sign data from medical devices for predictive models and preventive care | Quote: "What we're building out is a real-time clinical applications platform, so we're looking for other opportunities to turn that into decision support," Boal says. "We're looking for folks that are interested in device data integration." | Source: http://www.healthcare-informatics.com/article/moving-data-down-i-44-and-making-it-actionable
Monitor Patient Vitals in Real-Time with Sensor Data
Big Data is not just for the IT department anymore

Benefits:
Predict events proactively rather than responding reactively
Real-time alerts
Capture and transmit patient vitals at much higher frequencies
Improve patient satisfaction
Improve operational efficiency
Improved response times
Reduce adverse drug response times

Problem: Managing the volumes of system sensor data
In a typical hospital setting, nurses do rounds and manually monitor patient vital signs. They may visit each bed every few hours to measure and record vital signs, but the patient's condition may decline between scheduled visits. This means that caregivers often respond to problems reactively, in situations where arriving earlier may have made a huge difference in the patient's well-being.

Solution: Hadoop empowers healthcare by converting high volumes of sensor data into a manageable set of data
New wireless sensors can capture and transmit patient vitals at much higher frequencies, and these measurements can stream into a Hadoop cluster. Caregivers can use these signals for real-time alerts to respond more promptly to unexpected changes.
Over time, this data can go into algorithms that proactively predict the likelihood of an emergency even before it could be detected with a bedside visit.
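The real-time alerting idea can be illustrated with a small sliding-window check. This is a sketch under assumptions, not a clinical algorithm: the `VitalsMonitor` class, the thresholds, and the sample heart-rate values are all hypothetical, and a production system would run this logic inside a stream-processing engine rather than a Python list.

```python
from collections import deque
from statistics import mean


class VitalsMonitor:
    """Keeps a sliding window of readings and flags sustained out-of-range values."""

    def __init__(self, low, high, window=5):
        self.low = low
        self.high = high
        self.readings = deque(maxlen=window)

    def add(self, value):
        """Record one reading; return an alert string or None."""
        self.readings.append(value)
        if len(self.readings) < self.readings.maxlen:
            return None  # not enough data yet to smooth out noise
        average = mean(self.readings)
        if average < self.low or average > self.high:
            return f"ALERT: windowed average {average:.1f} outside [{self.low}, {self.high}]"
        return None


# Hypothetical heart-rate stream (beats per minute), one reading per sample.
monitor = VitalsMonitor(low=50, high=110, window=3)
stream = [82, 85, 90, 118, 125, 131]
alerts = [monitor.add(bpm) for bpm in stream]
```

Averaging over a window rather than alerting on single readings is one simple way to avoid paging a caregiver for a momentary sensor glitch; the window size trades responsiveness against noise.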
Monitoring of Healthcare Supply Chain to Minimize Waste
Big Data is not just for the IT department anymore
Healthcare: Supplier of pharmaceuticals and medical products to pharmacies and hospitals
Why Hadoop? Data discovery

Problem: Slow delivery of medical products can waste supplies, increase costs, and harm medical outcomes
Medical supplies and pharmaceuticals are time-sensitive and climate-controlled
Epidemics require agile changes to delivery schedules
Delivery logistics are complex and subject to risks outside of the company's control (e.g. product availability, weather, and traffic)

Solution: Monitoring with sensor data protects the supply chain and reduces waste today, and improves logistics in the future
Data from SAP and EDI in HDP gives unprecedented supply chain visibility
ETL offload increased data retention from 1 to 7 years of data, with daily updates
Better tracking reduces waste and improves customer confidence and patient health
Historical data informs long-term strategic investment decisions
Improve Patient Treatment with Real-time Monitoring of Vital Signs
Big Data is not just for the IT department anymore
Healthcare: Public university teaching hospital
Why Hadoop? Predictive analytics

Problem: Inability to store and access sufficient data for medical decision support in real time
9 million patient records on a legacy system were neither searchable nor retrievable
Cohort selection for research projects was slow, despite abundance of data
Clinicians had minimal access to historical data gathered across all patients

Solution: Unified data lake improves patient health, speeds research
Legacy system retired immediately, saving $500K in annual recurring expense
Records stored with patient identification for clinical use; the same data presented anonymously to researchers for cohort selection
Wireless patches transmit vital signs; algorithms notify doctors of high-risk patterns
Heart patients weigh themselves from home; algorithms notify doctors about unsafe weight changes and recommend a visit to the clinic
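One way to serve the same records to clinicians with identification and to researchers without it is to derive a research view on read. The sketch below is an assumption about how that could look, not the hospital's method: the field names, the `SECRET_KEY`, and both helper functions are hypothetical, and keyed hashing is pseudonymization, which is weaker than full anonymization.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would live in a key store, not in code.
SECRET_KEY = b"replace-with-a-managed-secret"
DIRECT_IDENTIFIERS = {"name", "mrn"}


def pseudonym(patient_id):
    """Derive a stable token from a patient identifier using a keyed hash."""
    digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def research_view(record):
    """Return a copy with direct identifiers dropped and the id tokenized."""
    view = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    view["patient_id"] = pseudonym(record["patient_id"])
    return view


# Hypothetical clinical record, stored once with full identification.
clinical_record = {
    "patient_id": "12345",
    "name": "Jane Example",
    "mrn": "MRN-0001",
    "diagnosis": "heart failure",
    "weight_kg": 81.4,
}
research_record = research_view(clinical_record)
```

Because the token is stable for a given patient, researchers can still group records into cohorts, while the stored clinical record is never modified.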
Yield, Quality, and Process Optimization at Merck
Big Data is not just for the IT department anymore
Role at Merck: Manufacturing innovation and analytics at Merck

Key Challenges:
Data silos
High cost of data retention
High cost of testing hypotheses in the real world

Solution: Single view of data is the holy grail for yield and quality optimization
How to predict when a machine is going to fail
How to improve yield and enable feedback control
How this is being enabled by the Hortonworks Data Platform

Plans for the Future:
Analyze streaming machine sensor data in real time
Proactively minimize yield variability
Predictive equipment maintenance
Enterprise Data Lake for Auto Manufacturing
Big Data is not just for the IT department anymore
Manufacturing: Major automotive OEM
Why Hadoop? Predictive analytics

Problem: Automaker sought to transform auto manufacturing by pursuing multiple big data use cases
Two mature enterprise warehouses (Microsoft and SAP) required cost-prohibitive ETL processing to capture sensor and machine data with variable structures
Company CIO mandated first Hadoop cluster within a 7-month time frame
Goal: archive 1.4 petabytes of manufacturing data for predictive analytics

Solution: HDP Data Lake established as Hadoop architectural standard to support multiple big data use cases, including:
Manufacturing optimization: two-mile auto painting assembly line monitored with more than 100K sensors measuring temperature, humidity, and paint mixtures
Streaming usage data from cars (post-sale) for valuable engineering insight
Predictive analytics for extreme testing sends alerts to prevent engine explosions
Data-as-a-service for retailers on returns and sales velocity

Lessons Learned
Margin vs. Marginalized: What sort of healthcare legacy do you want to leave behind?
Technology Enabled Services: Like it or not, fast or slow, healthcare runs at the speed of software now.
No Question: The Hadoop ecosystem is revolutionizing data management and analytics.
Adoption Curve: Start learning and adopting now; be ready for full adoption in 2 to 3 years.
The Hadoop Community: Incredible velocity and quality of collaboration to commoditize this revolutionary technology. It's not the technology that matters; it's how you use it.
Follow-up group participation
Would you like to participate in a follow-up group on this topic that would meet 2-3 times next year to share progress, challenges, and best practices? (Yes, No)
Analytic Insights
Questions & Answers
Choose one thing
Write down one thing you will do differently after hearing this presentation.
Session Feedback Survey
On a scale of 1-5, how satisfied were you overall with this session?
Not at all satisfied
Somewhat satisfied
Moderately satisfied
Very satisfied
Extremely satisfied
What feedback or suggestions do you have?
Upcoming Speakers

3:45 PM to 4:35 PM | Location: Grand Ballroom
Delivering Excellence at Stanford Health Care
Amir Dan Rubin, President and CEO, Stanford Health Care

4:35 PM to 5:00 PM | Location: Grand Ballroom
The Future World of Value-Based Healthcare (Documentary featuring Michael Porter)
Caleb Stowell, MD, Vice President, Research and Development, International Consortium for Health Outcomes Measurement (ICHOM); Senior Researcher, Harvard Business School