The human genome is a treasure trove of data that can be used to study diseases, develop medicines and more. However, analyzing the data can be difficult without the right tools and knowledge. In this tutorial we’ll explore how to build visualizations using Python and Jupyter Notebook in order to make sense of genetic information.
The word “gene” was coined in 1909 by Danish botanist Wilhelm Johannsen to describe the units of heredity. DNA, or deoxyribonucleic acid, is the molecule that carries genetic information and is made up of nucleotides. Each nucleotide carries one of four bases: adenine (A), cytosine (C), guanine (G) and thymine (T). The sequence of these bases, paired across the two strands, makes up the human genome.
In 1953, James Watson and Francis Crick proposed the double helix structure of DNA, drawing on X-ray diffraction images of DNA fibers produced by Rosalind Franklin and Raymond Gosling.
The human genome is the complete set of DNA that contains the information for building and maintaining a human being. It comprises about 3 billion base pairs, yet the DNA in a single cell weighs only a few picograms (trillionths of a gram).
A genome is sequenced by determining the exact order of nucleotides (A, C, G, T) along a strand of DNA, whether mitochondrial (mtDNA) or nuclear (nDNA). The Human Genome Project, an international effort that ran from 1990 to 2003, set out to map this information in full detail.
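Since a sequenced genome is ultimately a long string of the letters A, C, G and T, a first look at the data can be as simple as counting bases. The following is a minimal sketch using only the Python standard library; the function names `base_counts` and `gc_content` and the example fragment are invented for illustration and are not real genomic data.

```python
from collections import Counter


def base_counts(sequence):
    """Count each of the four bases in a DNA sequence, ignoring case."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(sequence):
    """Return the fraction of A/C/G/T bases that are G or C (0.0 if none)."""
    counts = base_counts(sequence)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0


# Made-up fragment for illustration only.
example = "ATGCGCGTATTAGC"
print(base_counts(example))
print(gc_content(example))
```

Real sequence files are far larger and usually contain extra symbols (such as `N` for unknown bases), which this sketch simply leaves out of the totals.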
Once you’ve imported your data, you can start filtering and visualizing it. Filtering means picking out the parts of your dataset that are most interesting to look at. For example, if we were analyzing people’s genes to see what makes a person tall or short, we could keep only the tallest (or only the shortest) people and compare their genotypes.
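The filtering step described above can be sketched in plain Python. Everything here is an assumption made for illustration: the record layout (`id`, `height_cm`, `genotype`), the sample values and the helper name `filter_by_height` are hypothetical, and a real project would more likely load records from a file.

```python
# Hypothetical records: IDs, heights and genotypes are invented.
samples = [
    {"id": "S1", "height_cm": 182, "genotype": "AA"},
    {"id": "S2", "height_cm": 158, "genotype": "AG"},
    {"id": "S3", "height_cm": 175, "genotype": "GG"},
    {"id": "S4", "height_cm": 190, "genotype": "AA"},
]


def filter_by_height(records, min_cm=None, max_cm=None):
    """Keep records whose height lies within the optional bounds (inclusive)."""
    kept = []
    for record in records:
        height = record["height_cm"]
        if min_cm is not None and height < min_cm:
            continue
        if max_cm is not None and height > max_cm:
            continue
        kept.append(record)
    return kept


tall = filter_by_height(samples, min_cm=180)
short = filter_by_height(samples, max_cm=160)
print([r["id"] for r in tall])
print([r["id"] for r in short])
```

The 180 cm and 160 cm cut-offs are arbitrary; choosing thresholds is itself part of the analysis.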
Visualizing is how we convert our filtered data into something we can see on screen. This usually involves making some kind of graph from your dataset, such as a bar chart of height measurements.
Interpreting is where things get interesting! When interpreting your visualization, think about what it means in real-life terms (e.g., do the taller people in your sample tend to share a particular variant?). If there are any outliers in the dataset (people who don’t fit with what’s expected), why might this be?
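As a rough illustration of the bar chart idea, the sketch below groups height measurements into bins and prints a text-only chart. It deliberately avoids any plotting library so it runs anywhere; the heights, the 10 cm bin width and the helper names `bin_heights` and `text_bar_chart` are all assumptions made for this example.

```python
from collections import Counter


def bin_heights(heights_cm, bin_width=10):
    """Group heights into bins; returns {bin_start: count} ordered by bin."""
    counts = Counter((int(h) // bin_width) * bin_width for h in heights_cm)
    return dict(sorted(counts.items()))


def text_bar_chart(binned, bin_width=10):
    """Render binned counts as one line of '#' marks per bin."""
    lines = []
    for start, count in binned.items():
        label = f"{start}-{start + bin_width - 1} cm"
        lines.append(f"{label:>12} | {'#' * count} {count}")
    return "\n".join(lines)


# Invented height measurements in centimetres.
heights = [158, 162, 165, 171, 174, 176, 179, 183, 188]
binned = bin_heights(heights)
print(text_bar_chart(binned))
```

In a Jupyter Notebook with matplotlib installed, the same binned counts could instead be passed to `matplotlib.pyplot.bar` to draw a graphical bar chart.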
The end result is a data visualization that can help you understand the human genome. A good visualization might show, for example, how the number of genes in a particular region relates to the number of people with a certain genotype in that region.
If you’re looking for some motivation, just know that the hard work will be well worth it in the end. Not only will you gain valuable skills in data visualization, but you will also be able to:
- Learn new tools and technologies – Whether it’s mapping software or a programming language like Python, there are many different methods of analyzing genomic data. By gaining experience with these tools and techniques, you’ll be able to explore diverse ways of visualizing your data.
- Work with a team – The genome is one of humanity’s most valuable resources; it holds a great deal of information about our past, present and future (and those of our children). Working on genome projects gives us the opportunity to share our findings with others through publications or presentations at conferences such as BioVis 2018!
- Explore your own dataset – Every individual carries their own set of variants on each chromosome (any two people differ at roughly 0.1 percent of positions), and there are many ways to represent this variation using colors rather than numbers alone. For example, red hair is usually a recessive trait, so a red-haired person has typically inherited a variant copy of the relevant gene from each parent, even if neither parent has red hair. A plot could use one color for people with two variant copies, another for people with one copy and a third for people with none.
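The color-coding idea from the last point can be sketched as a small lookup. This is only one possible scheme: the category names, the hex colors and the helpers `classify_genotype` and `genotype_color` are hypothetical choices made for illustration, not a standard.

```python
# Hypothetical color scheme; the hex values are arbitrary.
GENOTYPE_COLORS = {
    "hom_ref": "#4c72b0",  # two copies of the reference allele
    "het": "#dd8452",      # two different alleles
    "hom_alt": "#c44e52",  # two copies of the same non-reference allele
}


def classify_genotype(allele_a, allele_b, reference):
    """Label a pair of alleles relative to the reference allele."""
    if allele_a == reference and allele_b == reference:
        return "hom_ref"
    if allele_a == allele_b:
        return "hom_alt"
    return "het"


def genotype_color(allele_a, allele_b, reference):
    """Return the plot color assigned to this genotype."""
    return GENOTYPE_COLORS[classify_genotype(allele_a, allele_b, reference)]


# Invented genotypes at a single position whose reference allele is "C".
for alleles in [("C", "C"), ("C", "T"), ("T", "T")]:
    print(alleles, genotype_color(*alleles, reference="C"))
```

The resulting color strings can be handed to most plotting tools as per-point or per-bar colors.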
The human genome is a fascinating topic. In this article, we’ve explored some of the most common ways that humans use DNA data and how it impacts our lives. But there are plenty more applications for genomics research than we’ve discussed here! We hope you have enjoyed learning about these topics as much as we have enjoyed researching them for this post.