Disease Motifs
Investigating the Biomechanics of Disease

This site uses cookies. Please allow the adverts to be displayed as these may help fund this site.

Disease Motifs - A Wild Speculative Excursion into the Field of Bioinformatics

I am in the process of writing a book about this work in general with all the explanations necessary and with some of the results of this research. Hopefully this might help and inspire other individuals to become involved.

Some of the Perl scripts that you may need to start building your own disease protein datasets are on the additional resources page; they are also included in the book's appendices.

Please be aware that this instalment contains many colour images, so you may prefer to download it to your computer if your Kindle or other device does not display colour images properly.

If you are interested in buying this work please click on the link below.
Disease Motifs - A Wild Speculative Excursion into the Field of Bioinformatics

Additional Resources for the Book

Some additional resources will be available on the Additional Resources page.

Parkinson's Disease (PD)

There were 103 proteins identified with some association with Parkinson's disease, one of which (AAKG2_HUMAN) was discarded for being a false positive, leaving 102 proteins that will be looked at here. This work has been done with databases that were downloaded from the EBI around the end of 2018 and start of 2019.

Unlike on the previous occasion, I decided to keep Beta-synuclein (SYUB_HUMAN) in the dataset because of its connection with Alzheimer's disease, which I think might be of greater interest, even though strictly speaking this protein can be seen as a false positive.

The basic imagery is displayed here, but the more in-depth analysis will be written up for the second instalment/chapter of the book that I am writing (see below). All being well, this will be released some time in the latter part of 2019.

To see the protein sequence analysis on the proteins that are associated with Parkinson's disease click on the following links:
Parkinson's disease dataset 2019
Set theory with the Parkinson's disease dataset 2019

Alzheimer's disease

There are a total of 98 proteins that seem to have some mention of Alzheimer's disease within the comments section of the protein datasheet. None were excluded. This work was done using a database that was downloaded from the EBI at the end of 2018 and the start of 2019.

To see the protein sequence analysis on the proteins that are associated with Alzheimer's disease click on the links below:
Alzheimer's disease protein dataset 2019
Set Theory with the Alzheimer's disease protein dataset 2019

Prion Diseases (CJD, FFI and GSS)

Three-dimensional structure of HuPrP(90-231 M129 Q212P). PDB ID 2KUN, DOI:10.2210/pdb2kun/pdb.
Human prion protein, amino acid residues 90 to 231, containing the Q212P mutation that is believed to increase the probability of an individual developing Gerstmann-Sträussler-Scheinker disease (GSS).
Image courtesy of the RCSB PDB
Some basic research into prion diseases such as CJD, GSS and FFI. This is largely work on the prion protein as well as Cystatin-C and Gamma-enolase, the latter two mainly in relation to CJD.

Journal of Alzheimers Disease & Parkinsonism

Preliminary set theory-type analysis of proteins associated with Parkinson's disease
- published paper 2014


In an attempt to create a model of Parkinson's disease (PD), eighty-three proteins were extracted from the Swiss-Prot protein database that had some casual mention of PD. These were split up into various subsets of proteins, of which three are focused on here: PARK, made up of proteins that had some indication that polymorphisms in the protein might increase a person's susceptibility to develop PD; MITOCHOND, proteins which had some association with the mitochondria; and MT-C1D, proteins that were implicated in mitochondrial complex 1 deficiency. The PARK subset had 21 out of 83 proteins (21/83); MITOCHOND 33 out of 83 proteins (33/83); and MT-C1D 17 out of 83 proteins (17/83). The results could be used to build up a basic model of PD, creating phenotypes based on sets of proteins. The main phenotypes established here are non-mitochondrial PD (50/83) and mitochondrial PD (33/83). Further division is possible, dependent on whether proteins have polymorphisms which increase susceptibility to develop PD. MT-C1D seems to be independent of the PARK set. This is a very simplistic attempt at modelling Parkinson's disease at the proteomic level and will need further work to build up a more complex and realistic PD proteomic disease model.
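The set-based division described above can be illustrated with ordinary set operations. The following is only a minimal sketch: the protein identifiers (PROT_A to PROT_F) and their subset memberships are made-up placeholders, not entries from the published dataset, and the function name `split_phenotypes` is hypothetical.

```python
def split_phenotypes(all_proteins, park, mitochond, mt_c1d):
    """Divide a protein dataset into simple set-based groups."""
    # Proteins with no mitochondrial association form the complement set.
    non_mitochond = all_proteins - mitochond
    return {
        "mitochondrial": mitochond,
        "non_mitochondrial": non_mitochond,
        "mitochondrial_park": mitochond & park,
        "non_mitochondrial_park": non_mitochond & park,
        # True when no protein is shared between MT-C1D and PARK.
        "mt_c1d_independent_of_park": mt_c1d.isdisjoint(park),
    }


# Placeholder data only (hypothetical identifiers and memberships).
all_proteins = {"PROT_A", "PROT_B", "PROT_C", "PROT_D", "PROT_E", "PROT_F"}
park = {"PROT_A", "PROT_D"}
mitochond = {"PROT_A", "PROT_B", "PROT_C"}
mt_c1d = {"PROT_B", "PROT_C"}

groups = split_phenotypes(all_proteins, park, mitochond, mt_c1d)
for name, members in groups.items():
    print(name, members)
```

With the real subsets, the size of the complement gives the non-mitochondrial count directly: 83 proteins in total minus 33 in MITOCHOND leaves 50.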

Theoretical Biological Switch With a Possible Important Role In Parkinson's Disease, Schizophrenia, Hyposmia and the Onco-Parkinson's Mechanism in A Collection of Bioinformatics Papers

This is available to buy from Amazon


Antagonists for the Histamine Receptor H2 (HRH2_HUMAN) protein may have some efficacy in reducing levodopa-induced dyskinesia in Parkinson's disease. Short sequence proteomic analysis was applied to the histamine receptor H2: the protein sequence was split into smaller fragments of ten amino acids, and each fragment was compared (using BLAST) against the human protein database.

Three main motifs were found. The first two were named the CW-motif and the NxxxNP-motif, with the adjacent amino acid residues being important. The third motif, NPxxY, overlays the end of the NxxxNP-motif and may form part of a rhodopsin NPxxY(x)5,6F type motif, but in this case it is a variant form, NPxxY(x)6,7F, present in many of the serotonin and dopamine receptors found here. When the variants of NPxxY found in the dopamine and serotonin receptors were used as a specific amino acid search term (non-BLAST), a high proportion of the proteins returned were olfactory proteins.
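The motif notation above maps naturally onto a regular expression, where x stands for any residue. The sketch below is only an illustration of that mapping, not the search method used in the paper; the function name `find_motif` is hypothetical and the example sequence is invented purely to show a match.

```python
import re

# NPxxY(x)5,6F: rhodopsin-type form; NPxxY(x)6,7F: the variant form.
RHODOPSIN_TYPE = re.compile(r"NP..Y.{5,6}F")
VARIANT = re.compile(r"NP..Y.{6,7}F")


def find_motif(pattern, sequence):
    """Return (1-based start position, matched text) for each match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Invented toy sequence: NP, two residues, Y, seven residues, then F.
toy_sequence = "GGNPLIYAAAAAAAFGG"

print(find_motif(VARIANT, toy_sequence))
print(find_motif(RHODOPSIN_TYPE, toy_sequence))
```

Note that the two patterns overlap when exactly six residues separate the Y and the F, so a sequence of that kind would match both forms.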

This small region of similarity amongst dopamine, serotonin and olfactory receptors might be suggestive of a biological switch which might play an important role in Parkinson's disease, schizophrenia and the onco-Parkinson's effect seen in people with Parkinson's disease (who may have a raised or lowered risk of developing certain types of cancer).

There is more data available on the Project 3 page

Predicted Onco-Motif

The whole nine-page article is available to buy. See Predicted Onco-Motif on Amazon.

Figure 1 Showing the number of BLAST hits returned per small sequence


Short sequence similarity searches are used to find motifs that might have some association with a pathology, e.g. an onco-motif.
A protein sequence is split up into small sequences 10 amino acids long, and these are compared against the Swiss-Prot protein database using BLAST. A multiple sequence alignment is used to find areas of similarity on some of the sequences.
Two maxima were found at sequences 169 and 178. Sequence 178 (VWSFGILLWE) returned 250 proteins of many different species but there were only eight viral proteins present, all from viruses that have been known to cause cancer albeit in non-human hosts e.g. mice, chickens. Forty human proteins were returned; the majority having some association with cancer.
A 12-amino-acid sequence (SDVWSFGILLWE) may form part of an onco-motif.
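The fragment-splitting step described above could be sketched as follows. This is an assumption-laden illustration rather than the scripts used for the paper: the function name `split_into_fragments` is hypothetical, and the default step of one residue (overlapping windows) is an assumption, since the text does not state how far each fragment is shifted. The example input is the 12-amino-acid sequence quoted above.

```python
def split_into_fragments(sequence, size=10, step=1):
    """Return (fragment number, fragment) pairs, numbered from 1.

    size is the fragment length; step is how far the window moves
    between fragments (step=1 is an assumption, giving overlaps).
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    last_start = len(sequence) - size
    return [
        (number + 1, sequence[start:start + size])
        for number, start in enumerate(range(0, last_start + 1, step))
    ]


fragments = split_into_fragments("SDVWSFGILLWE")
for number, fragment in fragments:
    print(number, fragment)
```

Each fragment would then be submitted as a separate BLAST query, and the number of hits per fragment number is what a plot such as Figure 1 summarises.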
Supporting Material

Figure 3 in an Excel sheet. This is a larger version of the table shown in the paper, with some references and links to cancer shown, all of which are mentioned in the datasheets of each protein.

The eight viral proteins and the PGFRB_HUMAN protein sequences

The forty human and eight viral protein sequences

If you notice any sort of error on this website I would appreciate it if you would let me know - Thank You
