I think our understanding of the current health data landscape is about as accurate as a map of the world in the 15th century: excruciatingly detailed in the areas we know well, but hilariously, woefully — even dangerously — incomplete when it comes to the areas of health and health care that lie beyond our grasp.
By way of illustration, here’s a map of the world circa 1490, by Henricus Martellus (above). Italy, where the mapmaker lived, is depicted in great detail while Africa and Asia are just rough approximations — and other continents, like Australia and the Americas, are missing altogether.
Another favorite depiction of myopic mapmaking is this whimsical drawing by Saul Steinberg for the New Yorker (at left).
Ninth Avenue in Manhattan dominates the foreground while the rest of the United States and the world is shown as a sort of after-thought, hardly worthy of the artist’s time.
What are we missing when we focus too closely on what is currently available? What treasures lie just beyond our view, in unexplored or walled-off areas of health and health care?
I found it useful this year to create a taxonomy of health data to remind myself and others that clinical data is just one small part of the puzzle of someone’s life. Here’s a slide listing some of the categories:
Sara Riggare’s illustration of the hours that she spends on self-care vs. the one hour with her neurologist is another wonderful “map” of health data.
What sources of health data are not yet on our radar — but should be? Where are the ripe areas of opportunity for people who want to create positive change, using data? What sources of data would you rather NOT include in a public map of your health? Comments are open.
If you will be in Seattle June 20-21, please come to Cambia Grove’s Interoperability Summit: Empowering Consumers with Data. My keynote will be a riff on this taxonomy theme AND I will reveal what I think is the ripest target for health data entrepreneurs.
Paul Wicks says
Nailed it. My mental map of Boston when we lived there was “Newbury St, Comm Ave, Place I Work, Place I Go To Conferences, Supermarket, and Logan Airport”. Drop me in Southie and I’d literally not know which way was up.
Similarly I’m pretty good at “online communities for adults with serious chronic health conditions who happen to speak English that’s mostly a web browser thing not a native app thing” – but much further out than that there be dragons.
This recent editorial (https://mental.jmir.org/2019/4/e12292/) from two colleagues of mine at PLM discusses theoretical vs practical challenges in one important area, mental health, and tries to make the case that rather than there being one “killer app”, patients use a whole range of approaches, many of which were never designed for health applications. To take a personal example, the recurring daily “take vitamin” item in my Things list and monthly “vaccinate the dog” don’t fit on many maps of health data, yet they’re pretty key.
Obviously that’s only the stuff I can see and know how to access. If I were a terrorist or something I’m sure Palantir could dig out all sorts of digital exhaust I didn’t know existed. (not a terrorist)
Susannah Fox says
Thanks, Paul!
When I typed in “EZ-pass, bus pass records” I was thinking about how my transportation patterns over the last few years would track along with my dad’s illness. The sicker he got, the more time I spent shuttling up to New Jersey. And my scale would show the stress eating & drop-off in exercising that came along with being a caregiver. Sadly, he’s gone, but happily, I’m working my way back to fitness — which my gym visitation records would show. But all of that is invisible to my clinicians & EHRs.
Here’s another illustration of data hidden in plain sight:
Shopping blindly for post-acute care is a recipe for disaster, by Lynn Rogut
Quote: “Current federal regulations restrict hospitals from recommending specific post-acute care providers as a way to avoid financial conflicts of interest…Patients or their family members usually select post-acute care based on location or word-of-mouth recommendations rather than on quality of care. But there’s a lot on the line since the wrong choice can increase the risk of rehospitalization, emergency visits, decline in physical or mental function, or becoming a permanent resident of a nursing home.”
And yet even the quality ratings data that exist are imperfect (to say the least). See:
‘It’s Almost Like a Ghost Town.’ Most Nursing Homes Overstated Staffing for Years, by Jordan Rau
Quote: “The payroll records provide the strongest evidence that over the last decade, the government’s five-star rating system for nursing homes often exaggerated staffing levels and rarely identified the periods of thin staffing that were common. Medicare is now relying on the new data to evaluate staffing, but the revamped star ratings still mask the erratic levels of people working from day to day.”
How might we find ways to surface useful data about rehab centers and nursing homes so that WHEN people go looking, they can find quality data to help them make the best choice possible?
M. C. Collet says
We have an expensive ($98,000,000 to date), incomplete ALS registry at the CDC.
Instead we could give every person in the US a Costco card. A free Costco card would find a lot more people with ALS than the CDC has managed to find.
And then Costco knows where you live, what you eat, your basic economic situation, and what you buy for recreation. Bingo!
Susannah Fox says
Thanks for making the jump from Twitter! I love this idea (and suspect that there is a shadow health registry at every big retailer). How might we think creatively about partnerships among clinical centers, payers, retailers, and other stakeholders? Without, you know, being extremely creepy?
Madhuri Reddy says
Function (e.g., the ability to perform activities of daily living such as bathing, dressing, grooming) and mobility need to be included. Extremely important, particularly for older people.
Susannah Fox says
Thank you! These are important activities to track and be aware of when making all kinds of decisions – such a good point. Who tracks this now? In a care setting, I’ve observed the physical therapy staff noting progress (or regression). But otherwise is there a system for tracking ADLs? For example, an app that family caregivers could use? PatientsLikeMe.com provides tools for tracking progression of disease — what other sites do?
Nooman says
Hi Susannah
Great article (and thanks for reminding me of the Steinberg maps!).
Several of the comments above refer to ‘hidden’ data such as supermarket purchasing habits and so on.
I wanted to raise 3 issues.
Firstly, a technical aspect: much of what we talk about in relation to health data is that which can be codified, manipulated, stored (and sold). But this is only one dimension of data. As we move through the world, individuals and firms make many inferences using data that can’t truly be said to be stored anywhere, yet it has value. For example, when I leave work late to get the subway I might be greeted by an ad on the platform for stress management therapy. Why? Because somebody inferred that a person standing on a subway platform in the financial district at night probably needs stress management. Compare this to when I get home and log on and am greeted immediately by an advert for stress management – because my local pharmacy captured my pill-buying habits and sold them to Google.
Both are relevant yet only one seems to have value on the face of it.
Secondly, with data there is a notion (unintended perhaps) of a static set of information. This is where the ‘data as oil’ analogy partly comes from. But data is infinitely divisible and amorphous in the main, and I worry that health businesses built on ‘realising the value of data’ shift the boundaries of what’s relevant or not from a policy perspective – towards what can be captured financially.
Finally, I’d ask: how much data do we need to predict and manage health? Your list does a great job of highlighting formal and informal sources of health data. But in the UK, for example, postal codes are still the most significant explainer of health outcomes. By moving the focus away to exotic, detailed (and apparently financially valuable???) sources of data we’re missing the obvious – like how much wealth inequality is tied to outcomes.
Using larger and larger data sets might be useful but at the moment it only seems to add marginally to explanatory power. Again we run the risk of ventures created to help people that are predicated on some new valuable data set that actually doesn’t add that much.
Hope that makes sense? Please excuse typos and bad grammar. On the move at a conference so using mobile tech.
Susannah Fox says
Zounds! If this is what you can write on the move at a conference then I am impressed. Thanks for making the jump from Twitter to explain your intriguing comment about postal codes as a wealth (and therefore health) proxy.
Susannah Fox says
Lots of people suggested sources of data on Twitter — some made the jump to post on their own (thank you!) — but others haven’t yet, so I’ll add the ideas here:
1) Within clinical visits, subcategories of home health visits and school nurse/clinics. And telehealth, urgent care.
2) The Six Dysfunctional Pillars of Connected-Precision Medicine
3) school attendance data and credit card data (to expand purchases knowledge and incorporate socializing spend). Also salary data to determine ability to afford healthcare and food/housing in a given zip code
4) Financial – insurances, credit card, etc.
Residential – quality, maintenance of major appliances, water supply, a/c or heat access. Walkability-type stuff: can they even go outside and walk? Is fresh fruit or milk available in a .25 mi radius? 2mi? 20mi? Transportation options and accessibility? Days since major life event (family member passes, deployment or incident at a loved one’s or kid’s work/school, major move, promotion, kids entering life, etc)?
5) Are there weapons in the home? Should that be on the list?
6) Looking over the list, curious which data has interoperability standards already?
7) Sleep patterns?
8) #ACES – adverse childhood experiences have a direct correlation to adult chronic disease so you might include that.
9) Housing- thinking of the uptick in people living in RVs due to housing shortages/costs.
If you see your idea listed, please feel free to elaborate! And keep the ideas coming, please!
Vince Kuraitis says
Susannah,
I really like the BREADTH with which you are portraying health data.
A relevant perspective from a recent McKinsey report:
“…the average patient will, in his or her lifetime, generate about 2,750 times more data related to social and environmental influences than to clinical factors”
https://mck.co/2JJvHNy
Vince
Rodrigo Martinez says
This is fantastic, Susannah! I would add genomic data. As thousands (soon hundreds of thousands, and then millions) of people sequence their whole genome, our understanding of health will be radically transformed. Your genome is not just the largest digital file (~150GB) of all your health data, but is increasingly shedding light on disease risks, conditions we may carry and pass on, etc. Soon, every newborn will be sequenced at birth. We are entering what is referred to as the emerging Social Genome era. https://www.veritasgenetics.com/next-genomics-revolution-era-social-genome
Susannah Fox says
Dropping a link here for everyone to read & consider when we have time (including myself)
Evaluating the predictability of medical conditions from social media posts by Merchant et al. PLoS ONE
Abstract:
“We studied whether medical conditions across 21 broad categories were predictable from social media content across approximately 20 million words written by 999 consenting patients. Facebook language significantly improved upon the prediction accuracy of demographic variables for 18 of the 21 disease categories; it was particularly effective at predicting diabetes and mental health conditions including anxiety, depression and psychoses. Social media data are a quantifiable link into the otherwise elusive daily lives of patients, providing an avenue for study and assessment of behavioral and environmental disease risk factors. Analogous to the genome, social media data linked to medical diagnoses can be banked with patients’ consent, and an encoding of social media language can be used as markers of disease risk, serve as a screening tool, and elucidate disease epidemiology. In what we believe to be the first report linking electronic medical record data with social media data from consenting patients, we identified that patients’ Facebook status updates can predict many health conditions, suggesting opportunities to use social media data to determine disease onset or exacerbation and to conduct social media-based health interventions.”
Another related (older) article: “When Healthcare Data Analysts Fulfill the Data Detective Role,” by John Wadsworth of Health Catalyst.
A quote: “Data detectives are story engineers. In a way, they give voice to the data. In the above story, Matt assembled narratives from data systems. He learned that orders for narcotics were placed within the EMR, but narcotics were released from locked cabinets that logged who actually accessed the cabinets and confirmed the physician order. Nursing shift data was kept in another isolated system that held time-tracking data.
A second function of the data detective is to work with other domain experts to bring together seemingly unrelated data. A good data detective is self-aware enough to know when he doesn’t know something. Technically, he is proficient at accessing, provisioning, data modeling, and analyzing data, but he often lacks meaningful context. Armed with data profiling efforts, he partners with domain experts to collaboratively focus on the narrative and the white space around the narrative. Domain experts can quickly spot issues within captured data because they know what should and should not be in the data capture process.”
Kristin Bennett says
What else to map? Well, I’d start with some of these things: alone time and, to get a glimpse into ACE scores, things like “how happy do you remember being when you were __ years old?” Or how many times you think you argue with your people, for example spouse/children/roommate/parents etc.
Another angle that I’d love to see more research on, is about personal care products, vitamins that sort of thing, I call it my ‘Medical Analytics Fantasy’ or more recently I reference it as ‘Correlation Station’ since people love to downplay what can be seen with that kind of data. Imagine quizzes where people say what products they remember using at different ages, getting ahold of the ingredient list data from years back and then cross referencing that with studies and eventual effects of them on animals/people who used or were involved in case studies.
I feel like there is SO much hidden in these details, this is getting long for a comment though. Excited to see things that you find in your studies!!
Susannah Fox says
Thanks, Kristin, these are great ideas for further exploration!
Susannah Fox says
Since this blog is my outboard memory, here are a few more ideas and links shared on Twitter.
I saw a tweet from MD Environment warning of a Code Orange day for central Maryland. I shared it and wrote, “When we talk about health data, we should include air quality measures.”
Tatyana Kanzaveli replied: “Yes! You remember we did a talk presenting predictive disease rates at a zip code level using public datasets- one of them was environmental data!”
Chris Hogg of Propeller Health tweeted: “We model asthma/respiratory risk based on local environment. We opened a public API where you can send a location and receive air quality and asthma risk (based on air quality).”
Jane Sarasohn-Kahn shared a graphic from her book, Health Consuming, that illustrates that air pollution is a greater risk factor for mortality than HIV/AIDS, smoking, and alcohol & drug abuse. Data via the Air Quality Life Index project EPIC @UChiEnergy https://aqli.epic.uchicago.edu/
Chantal Kerssens suggested we add “noise levels” to the health data landscape.
Please keep the suggestions and resources coming!
Smith says
Another angle that I’d love to see more research on, is about personal care products, vitamins that sort of thing, I call it my ‘Medical Analytics Fantasy’ or more recently I reference it as ‘Correlation Station’ since people love to downplay what can be seen with that kind of data. Imagine quizzes where people say what products they remember using at different ages, getting ahold of the ingredient list data from years back and then cross referencing that with studies and eventual effects of them on animals/people who used or were involved in case studies.
Susannah Fox says
I love that idea! If we are playing What If, I’d get permission to match a person with their grocery, drug store, bodega purchases and create an easy check list — was this for you? Do you remember eating/drinking/taking this? That way we’d prompt their memory a bit.
Susannah Fox says
Adding to my outboard memory, here’s an excerpt from an article published in JAMA on 9/12/19 which points out another big missing data set: “Primary Care Selection: A Building Block for Value-Based Health Care” by Scott Heiser, MPH; Patrick H. Conway, MD, MSc; Rahul Rajkumar, MD, JD
“…the Office of the National Coordinator for Health Information Technology should explore whether a data standard could enable the collection, storage, and sharing of the PCP selection data of all US patients across the continuum of care. The same way that data standards were created for the admission, discharge, and transfer data that is now proposed to be a requirement of participation in Medicare, so too should be the capture and dissemination of PCP selection data. This would further ensure that the often-repeated maxim of ‘right care, right place, right time’ can actually occur if the PCP of record can be identified at any point in the delivery of care.”