Big Data: The Double-Edged Sword

In today’s climate, industries often talk about the buzzword of this era: big data. Here, big data refers to data on the macro scale, most of it unorganized and unstructured. Its use has tremendous potential for various industries, healthcare included. Facebook, Amazon, and Netflix, for example, build big data into their digital infrastructure, creating algorithms that match customers with their interests. Some experts describe big data in terms of three or four Vs: Volume, referring to the large amount of data; Velocity, referring to the timely generation of data; Variety, referring to the multiple forms of data (e.g., genetics, emails, numbers, surveys); and Veracity, referring to the quality of the data. I recently discussed the potential of big data in medicine in my expert review article.

For example, big data can be used to improve decision-making when combined with other emerging technologies such as artificial intelligence or the quantum internet. In future clinical trials, big data could combine clinical characteristics (e.g., high HbA1C, high cholesterol, hypertension), multi-omics (e.g., genes, proteins, metabolites), lifestyle (e.g., cigarette smoking, exercise, physical activity, sleep hygiene), and environmental factors (e.g., air pollution, PM2.5, traffic noise) with artificial intelligence (Figure). However, this remains just a pipe dream at the present moment. As Dr. Jacqueline Tamis-Holland discussed at the AHA meeting today, current clinical trials do not confirm genotype-guided antiplatelet therapy, and so far big data techniques have been primarily descriptive and retrospective. In the future, with advanced computational power (such as quantum computing), leveraging big data in medicine is promising.

Source: Krittanawong et al. JACC 2017

Big data also has its limitations, and there are several lessons we must learn before implementing it effectively. First, big data is rarely well curated and comes with a large degree of heterogeneity. Thus, selecting the correct technology, backed by human oversight, to curate big data is crucial. Second, analytics companies can misinterpret big data by asking the wrong research questions or by using the wrong tools to analyze the data, and so deliver false messages. Surgisphere is a prime example of what can go wrong in the analysis of big data. Surgisphere claimed to have collected data from over 1000 hospitals worldwide. Although this is possible, and emerging technology can accomplish the task with minimal human resources, it is unlikely that data gathered this way can also be well curated. In addition, healthcare data is challenging to work with: the integration of electronic health records (EHRs) and data privacy are the primary barriers. Another example is the Cambridge Analytica case, in which data obtained from Facebook was used without consent.

When used appropriately, big data can be a game-changer for various industries, including healthcare. This requires well-curated data, pertinent research questions, transparency, appropriate analytic tools, and advanced computational power. In the wrong hands, big data can be a potent threat that disrupts entire industries.