Data Science and Coding for Clinicians – Where to Start

Medicine is seeing an explosion of data science tools in clinical practice and in the research space. Many academic centers have created institutions tailored to integrating machine learning (ML) and artificial intelligence (AI) into medicine, and major associations including the AHA have created funding opportunities and software tools for clinicians interested in harnessing the promise of big data for their research.

While knowledge of the underlying algorithms and of writing code is not necessary to lead a multidisciplinary team working in this space, some clinicians want a working knowledge of what is happening under the hood. Thankfully, the computer science (CS) and AI communities offer numerous free online resources to help with this. As I embark on a Masters in Artificial Intelligence, I have used these courses as prep work and found them to be highly educational.

  1. Python for Everybody – By Dr. Charles R. Severance, University of Michigan

This course is meant to get those with no programming background up and running with Python. It focuses on understanding the underlying syntax of the language and the various data structures that come standard in Python. It also touches on web applications, SQL, and data visualization. Thorough, but approachable, this is a great place to start.

  2. CS50x – By Dr. David Malan, Harvard University

One of the most popular courses at Harvard, this course is an intensive introduction to computer science, focusing on key concepts and using various programming languages to illustrate them. The first half or so of the course teaches you to program in C, a low-level language that illustrates how a computer really functions, before moving on to Python (and various Python frameworks), SQL, and web programming. While the juice is definitely worth the squeeze, this course is a commitment and takes significant mental energy to get through.

  3. Machine Learning – By Dr. Andrew Ng, Stanford

One of the courses that popularized the massive open online course (MOOC) revolution, here AI visionary Dr. Ng takes you through a survey of ML/AI algorithms with real-world examples and problem sets to work through. The main programming language is MATLAB. This course is enough to give you a basic overview of how these algorithms run and the types of data they are best at handling, serving as a solid introduction to the field.

  4. Machine Learning for Healthcare – By Peter Szolovits and David Sontag, MIT

Healthcare and the data it generates are unique, posing challenges distinct from other fields where ML/AI are commonly employed. This course highlights these points through a thorough investigation of healthcare data, common questions clinicians ask in routine patient care, and the clinical integration of ML. It touches on many different topics, including ML for cardiac imaging, natural language processing of clinical notes, and reinforcement learning. No coding is required for this course.
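For a flavor of what the first course on this list means by the data structures that come standard in Python, here is a minimal sketch. The variable names and readings are invented for illustration and are not taken from any of the courses.

```python
# Hypothetical values, invented purely for illustration.

# list: an ordered, changeable sequence (e.g., repeated measurements)
systolic_readings = [128, 135, 142, 131]

# dict: key-value pairs (e.g., one patient's attributes)
patient = {"id": "demo-001", "age": 64, "smoker": False}

# set: unique items only, so the repeated entry collapses
conditions = {"hypertension", "diabetes", "hypertension"}

# tuple: a fixed-size, unchangeable grouping (systolic, diastolic)
latest_bp = (128, 82)

# Built-in functions work directly on these structures.
mean_systolic = sum(systolic_readings) / len(systolic_readings)

print(f"Patient {patient['id']}: mean systolic {mean_systolic} mmHg")
print(f"Distinct conditions: {len(conditions)}")
```

Nothing here needs an external library, which is part of why Python is an approachable first language.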

While these courses are just a start, they provide the groundwork for further investigation. In many cases, they are enough to develop an intuition for more complex material, including deep learning. If these are topics that interest you, I encourage you to jump on in!

“The views, opinions and positions expressed within this blog are those of the author(s) alone and do not represent those of the American Heart Association. The accuracy, completeness and validity of any statements made within this article are not guaranteed. We accept no liability for any errors, omissions or representations. The copyright of this content belongs to the author and any liability with regards to infringement of intellectual property rights remains with them. The Early Career Voice blog is not intended to provide medical advice or treatment. Only your healthcare provider can provide that. The American Heart Association recommends that you consult your healthcare provider regarding your personal health matters. If you think you are having a heart attack, stroke or another emergency, please call 911 immediately.”


Big Data: The Double-Edged Sword

In today’s climate, industries often talk about the buzzword of this era: big data. Here, big data refers to data on the macro scale, most of it unorganized and unstructured. The utilization of big data has tremendous potential for various industries, healthcare included. Facebook, Amazon, and Netflix, for example, build big data into their digital platforms, creating algorithms to match customers with their interests. Some experts describe big data in terms of three or four Vs: Volume, referring to the large amount of data; Velocity, referring to the timely generation of data; Variety, referring to the multiple forms of data (e.g., genetics, emails, numbers, surveys); and Veracity, referring to the quality of the data. I recently discussed the potential of big data in medicine in my expert review article (https://www.tandfonline.com/doi/abs/10.1080/23808993.2018.1528871).

For example, big data can improve decision-making when combined with other emerging technologies such as artificial intelligence or the quantum internet. In future clinical trials, big data combined with artificial intelligence could integrate clinical characteristics (e.g., high HbA1C, high cholesterol, hypertension), multi-omics (e.g., genes, proteins, metabolites), lifestyle (e.g., smoking cigarettes, exercise, physical activities, sleep hygiene), and environmental factors (e.g., air pollution, PM2.5, traffic noise) (Figure). However, this remains just a pipe dream at the present moment. As Dr. Jacqueline Tamis-Holland discussed at the AHA meeting today, current clinical trials do not support genotype-guided antiplatelet therapy. So far, big data techniques have been primarily descriptive and retrospective. In the future, with advanced computational power (such as quantum computing), leveraging big data in medicine is promising.

Source: Krittanawong et al. JACC 2017

Big data also has its limitations, and there are several lessons we must learn before implementing it effectively. First, big data is never well-curated and comes with a large degree of heterogeneity. Thus, pairing the right technology with human effort to curate big data is crucial. Second, analytics companies can misinterpret big data by asking the wrong research questions to test their hypotheses or by using the wrong tools to analyze the data, and so deliver false messages. Surgisphere is a prime example of what can go wrong in the analysis of big data. Surgisphere claimed to collect data from over 1000 hospitals worldwide. Although this is possible, and emerging technology can accomplish the task with minimal human resources, it is unlikely that such data can also be well-curated. In addition, healthcare data is challenging to work with, as the integration of electronic health records (EHRs) and data privacy are primary barriers. Another example is the Cambridge Analytica case, where data obtained from Facebook was used without consent.

When appropriately utilized, big data can be a game-changer for various industries, including healthcare. This requires well-curated data, pertinent research questions, transparency, appropriate analytic tools, and advanced computational power. In the wrong hands, big data can be a potent threat that can disrupt industries as a whole.



Tech in Cardiology


On a recent flight from San Francisco, I found myself sitting in a dreaded middle seat.  To my left was a programmer typing away in Python, and to my right was an oncologist flipping through a slide set on chemotherapy trials.  While this may sound like the beginning of a bad joke, I remember this moment because it got me thinking about the influence of tech on medicine.  The purpose of my trip, by the way, was to interview for a fellowship position in cardiology, a specialty with arguably some of the most impressive tech.



Not to discount advances in medical devices (e.g. leadless pacemakers, bioprosthetic valves), the emergence of consumer-facing wearable devices is as trendy as ever.  Google recently collaborated with AHA to build its fitness app (Google Fit), which uses algorithms to quantify physical activity in terms of “heart points.”1  The Apple Health app now incorporates EKG capabilities, allowing patients to record episodes of arrhythmias—something I have certainly witnessed in cardiology clinic.2


Big data

Big data is an increasingly prominent component of clinical research, and a number of joint ventures between medical and tech leaders have emerged.  One Brave Idea3 is a research collaboration between AHA and Verily (Alphabet’s life sciences division), which uses genomics to study coronary artery disease.  Meanwhile, Verily’s Project Baseline4 is a massive longitudinal observational study, a modern version of the Framingham Heart Study.


Artificial intelligence

AI could eventually play a prominent role in medical diagnosis and decision-making.  The Stanford Machine Learning Group5 has developed a neural network that outperforms cardiologists in diagnosing arrhythmias on EKG, a significant improvement on existing algorithms, which are often unreliable.  AI also carries vast potential in radiologic interpretation.  Already, Verily is using machine learning to interpret retinal images not only to detect diabetic retinopathy and macular edema but also to extrapolate information about cardiovascular risk.6



Electronic medical records represent an obvious space for tech innovation.  The Fast Healthcare Interoperability Resources (FHIR) standard is making it easier to share health information across our disjointed EMR systems.  Providers are now able to push health data directly to patients’ iPhones using Apple Health Records.7  One can only speculate whether we will see a legacy software giant compete directly in the EMR space.
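To make the FHIR idea concrete: FHIR exchanges health data as small structured "resources," commonly serialized as JSON.  The sketch below reads a made-up Patient resource using only Python's standard library.  The field names follow the published FHIR Patient resource, but the patient values and the summarize_patient helper are hypothetical, and a real integration would use an EMR's actual FHIR endpoint and authentication, which are not shown here.

```python
import json

# A made-up, minimal FHIR Patient resource (values invented for illustration).
patient_json = """
{
  "resourceType": "Patient",
  "id": "example",
  "name": [{"family": "Doe", "given": ["Jane", "Q"]}],
  "gender": "female",
  "birthDate": "1970-01-01"
}
"""


def summarize_patient(resource: dict) -> str:
    """Hypothetical helper: build a one-line summary from a Patient resource."""
    if resource.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")
    # "name" is a list of name records; fall back to an empty record if absent.
    names = resource.get("name") or [{}]
    first = names[0]
    given = " ".join(first.get("given", []))
    family = first.get("family", "")
    full_name = f"{given} {family}".strip() or "unknown name"
    gender = resource.get("gender", "unknown")
    birth_date = resource.get("birthDate", "unknown")
    return f"{full_name} ({gender}, born {birth_date})"


patient = json.loads(patient_json)
summary = summarize_patient(patient)
print(summary)
```

Because every conforming system uses the same field names, code like this does not need to know which vendor's EMR produced the record.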


Cardiology and the rest of medicine have long excelled at basic science and translational research, but digital tech is increasingly creeping in.  We are in a tech zeitgeist, and this is good for both patients and providers.



  1. https://www.heart.org/en/news/2018/08/21/google-just-launched-heart-points-here-are-5-things-you-need-to-know
  2. https://www.apple.com/healthcare/site/docs/Apple_Watch_Arrhythmia_Detection.pdf
  3. https://www.onebraveidea.org/
  4. https://verily.com/projects/precision-medicine/baseline-study/
  5. https://stanfordmlgroup.github.io/projects/ecg/
  6. https://blog.verily.com/2018/02/eyes-window-into-heart-health.htm
  7. https://www.apple.com/healthcare/health-records/