Wednesday 28 June 2017

Python Bioinformatics : How to count nucleotides in DNA sequences

Counting nucleotides is a common task in bioinformatics. In a language like Visual Basic it can take a long block of procedural code, but in Python 3.5 it is a matter of a few lines. Below is a simple example that counts the frequency of each nucleotide in a DNA sequence.

>>> DNA = input("please enter your DNA sequence below: \n")
please enter your DNA sequence below:
ATGCTGGATGCACACCGTCGATCGTATATTAAA
>>> A = "A"
>>> T = "T"
>>> C = "C"
>>> G = "G"
>>> print("Adenine count is :",DNA.count(A))
Adenine count is : 10
>>> print("Thymine count is :",DNA.count(T))
Thymine count is : 9
>>> print("Cytosine count is :",DNA.count(C))
Cytosine count is : 7
>>> print("Guanine count is :",DNA.count(G))
Guanine count is : 7
>>> print("the total number of Nucleotides in your DNA seq is :",DNA.count(A)+ DNA.count(T)+DNA.count(C)+DNA.count(G))
the total number of Nucleotides in your DNA seq is : 33
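
The same counts can also be collected in one pass and reused. The sketch below is only one possible way to do it: the function name count_nucleotides is made up for illustration, and collections.Counter from the standard library is swapped in for the repeated .count() calls used above.

```python
from collections import Counter


def count_nucleotides(sequence):
    """Return a dict with the counts of A, T, C and G in `sequence`."""
    # Normalise to upper case so "atgc" and "ATGC" are counted the same way.
    counts = Counter(sequence.upper())
    # A Counter returns 0 for missing keys, so absent bases are reported as 0.
    return {base: counts[base] for base in "ATCG"}


DNA = "ATGCTGGATGCACACCGTCGATCGTATATTAAA"
counts = count_nucleotides(DNA)

for base, n in counts.items():
    print(base, "count is:", n)
print("Total nucleotides:", sum(counts.values()))
```

Wrapping the logic in a function means the same code can be applied to any sequence, not only the one typed at the prompt.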


_____________________________________________________________________________
For a walkthrough of this procedure, see the video tutorial linked below:

"CLICK HERE"




 

Thursday 22 June 2017

Variables in Python in the biological context



Four basic data types cover most of what a beginner needs in Python (and have counterparts in most other programming languages): string, integer, float, and Boolean.

A string variable holds text data, as shown below:
>>> DNA = "TCGA"
>>> print(DNA)
TCGA
An integer variable, on the other hand, holds a whole number. Counting a nucleotide, for example, returns an integer:

>>> print(DNA.count("A"))
1

A float variable holds a number with a decimal part, like:


>>> 1/4
0.25
>>> print("A ratio :", (1/4)*100, "percent")
A ratio : 25.0 percent
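
The 1/4 above corresponds to one adenine in a four-base sequence. As a small sketch, the same ratio can be computed from the sequence itself rather than typed by hand; the variable names a_count and a_ratio are made up for illustration.

```python
DNA = "TCGA"

a_count = DNA.count("A")      # an integer: how many times "A" occurs
a_ratio = a_count / len(DNA)  # a float: / always returns a float in Python 3

print("A ratio :", a_ratio * 100, "percent")
```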

Boolean variables hold True or False and are used in logical tests and statements, like:

>>> if True:
...     print("A" in DNA)
...
True


or

>>> if True:
...     print("U" in DNA)
...
False
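
To see all four types side by side, the sketch below stores one value of each kind and asks Python for its type. The variable names (length, a_fraction, has_uracil) are chosen for illustration only. It also shows a Boolean doing real work as the condition of an if statement.

```python
DNA = "TCGA"                          # string
length = len(DNA)                     # integer
a_fraction = DNA.count("A") / length  # float
has_uracil = "U" in DNA               # Boolean

for value in (DNA, length, a_fraction, has_uracil):
    print(repr(value), "is of type", type(value).__name__)

# A Boolean used as the condition of a logical test.
if has_uracil:
    print("The sequence contains U, so it looks like RNA")
else:
    print("No U found, so the sequence looks like DNA")
```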


This is summarized in the video linked below:

Variables_LINK



Saturday 3 June 2017

print function in Python: print()

One of the features of Python that made me start using it is the handy print() function. Grab any introductory book about Python 3 (like Beginning Programming with Python For Dummies) and it tells you that you can write your first program by printing "hello world !" on the screen. So let's start using this function to print some biological data on the screen.
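
A minimal sketch of such a first print() call, using the DNA variable and the sequence TCGATG that the rest of this post works with:

```python
# Store a short DNA sequence as a string in the variable DNA.
DNA = "TCGATG"

# print() writes the content of the variable to the screen.
print(DNA)
```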







What the above code does is put "TCGATG" in the DNA variable; the print() function then shows it on the screen. So, whenever you write print(DNA), it shows you the content of the DNA variable, which is TCGATG.
It is important to know a little about variables as well: variables are virtual boxes that can hold different types of data.
The basic types are string for text, integer for whole numbers, Boolean for True and False values, and float for numbers with a decimal part. The DNA variable holds a string, which is text data.
Because print() is a function in Python 3, it can display the result of any expression you pass to it. A few examples are shown below:
1- reversing a string:
you may want to get the reverse of a DNA strand, and Python can do that in a single line of code using slice notation:
>>> print(DNA[::-1])
the result would be as follows:
GTAGCT
which is the reverse of TCGATG.
This is quite handy; in some other programming languages you would need an explicit loop or a helper function to do the same.
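
In practice the reverse of a strand is often combined with base complementing to get the reverse complement. The sketch below is one possible way to do that with the built-in str.maketrans() and str.translate(); the variable names are made up for illustration and it assumes the sequence contains only upper-case A, T, C and G.

```python
DNA = "TCGATG"

# Reverse the string with slice notation, as shown above.
reverse = DNA[::-1]

# Build a translation table that maps each base to its complement.
complement_table = str.maketrans("ATCG", "TAGC")
reverse_complement = reverse.translate(complement_table)

print("reverse            :", reverse)
print("reverse complement :", reverse_complement)
```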

2- you can print a label next to your data to describe the output. print() accepts several values and separates them with a space:
>>> print("this is your DNA sequence", DNA)
this is your DNA sequence TCGATG

3- you can write tandem repeats of your DNA sequence by multiplying the string by the number of times you want it to appear.
>>> print(DNA *2)
TCGATGTCGATG
4- you can wrap a variable in a list by putting [ ] around it inside the print function. Note that this gives a list with a single item, the whole string.
>>> print([DNA])
['TCGATG']
Lists are important when writing functions or methods for bioinformatics tools and software, as they make iteration easier and more reliable [read about iteration over strings in: Python Recipes Handbook by Joey Bernard].
5- if you would like to print every nucleotide of your DNA variable as a separate item, use the following code block:








NOTE: this is a for loop. It iterates over the characters of the string and prints each one as a separate item. The end=',' argument tells print() to finish each item with a comma instead of a new line, which is what separates the items from one another.
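
The loop described in the note can be sketched as follows. This is a reconstruction from the description, not necessarily the exact listing; the loop variable nucleotide is an assumed name. The last two lines show list(), which builds an actual Python list with one item per character.

```python
DNA = "TCGATG"

# Iterate over the string one character at a time. end="," replaces the
# newline that print() normally appends with a comma.
for nucleotide in DNA:
    print(nucleotide, end=",")
print()  # finish the line after the loop

# To get a real list with one item per nucleotide, list() is enough.
nucleotides = list(DNA)
print(nucleotides)
```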
6- you might also want to start counting nucleotides in a DNA sequence. To do so, use the following line of code:
>>> print("your DNA sequence contains", DNA.count("A"),"Adenine")
your DNA sequence contains 1 Adenine
7- using if statements is another useful way to print out your results.
>>> if True:
...     print(DNA.count("T"))
...
2

8- one can also use the .split() method inside print() to split a string into a list of pieces. With no argument it splits on whitespace, so the result depends on the spaces in your string. Here there are none, so the list contains a single item.
>>> print(DNA.split())
['TCGATG']
Split strings can also be joined back together with the .join() method.
For instance:
>>> a = "+" # this is a variable that holds + as a datum
>>> seq = ("A","T","G","C") # this is another variable that holds A, T, G, and C as data
>>> print(a.join(seq))
A+T+G+C
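
To see .split() and .join() working together, the sketch below uses a sequence that does contain spaces. The string codons_text is a made-up example, not data from this post.

```python
# A hypothetical sequence written as space-separated codons.
codons_text = "ATG CTG GAT"

# split() with no argument breaks the string at every run of whitespace.
codons = codons_text.split()

# join() glues the pieces back together with the chosen separator.
dashed = "-".join(codons)
sequence = "".join(codons)

print(codons)
print(dashed)
print(sequence)
```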








Project Genetic Analysis Toolpack (GAT V1.0)

The name GAT stands for Genetic Analysis Toolpack and we are aiming to make it a useful molecular data analysis tool, and more importantly...