Analysing biological data is challenging because of its size and complexity, and because of the diversity of tools and algorithms required to process and interpret it. Three major bioinformatics strategic projects are in progress, led by the University of the Western Cape (UWC), the University of Cape Town (UCT) and Stellenbosch University (SU).

Implementing a platform for tuberculosis surveillance in Africa (UWC)

The use of omics technologies and public health initiatives in Africa has given insights into the dynamics of tuberculosis infection. These approaches ultimately need to inform the roll-out of cost-effective diagnostic technologies and health interventions, yet there are no data analytics platforms in Africa that allow researchers to scale their analyses at the site where data is generated.

This project aims to harness cloud-based and metadata-aware technologies to distribute algorithms and store omics data, so that researchers in South Africa, Ghana, Uganda and Zimbabwe can access and use the data and protocols.

To date we have:

  • Implemented scalable Ceph storage for storing biological data.
  • Designed a Neo4j database for storing bacterial omics data.
  • Provided a scalable database to accommodate biological and clinical information.

Ongoing Research:

  • Providing a platform, built on OpenStack, that allows rapid analysis of tuberculosis data at data-generating sites.


Implementing an imputation service for the analysis of African human genetic data (UCT)

As part of the Human Heredity and Health in Africa (H3Africa) project, researchers have designed a new genotyping array that is customised for African populations. This is accompanied by a reference panel that is more appropriate for African genetic data than other available panels.

This project uses the Ilifu computing and storage facilities to run imputations for data generated on the new genotyping array using the African reference panel. H3Africa collaborators are able to use this service for their array data. If resources allow, we can then selectively open the service to other users so that they can run imputation using the reference panel.

The goals of the project are to:

  • Develop and implement a single nucleotide polymorphism imputation service using an African reference panel.
  • Use this tool for H3Africa projects to analyse genetic data generated by the H3Africa genotyping array.
  • Provide the tool as a service for other groups to do imputation using the African reference panel.


Omics computation for precision medicine (SU)

Precision medicine is the customisation of healthcare to individual patients. Omics data, such as genomics and metabolomics data, holds great promise for implementing precision medicine and is already being used to predict treatment outcomes in pharmacogenomics and oncology.

The current treatment regimen for tuberculosis is a compromise between overtreatment of the subset of cases that are cured quickly and undertreatment of those who are either not cured or have another episode within a year or two after the end of treatment. The long duration of treatment and the side effects of the drugs increase the probability of non-adherence and incomplete treatment, which compounds the problem of drug-resistant strains. There is thus an urgent need for treatment regimens that are effective but last only as long as necessary.

Molecular or omic techniques can be used to determine drug resistance in much less time than standard culture procedures and can guide adaptation of treatment regimens.

The goals of the project are to:

  • Develop computational pipelines that use omics data, as a pilot project, to predict the risk of poor or favourable outcomes of tuberculosis treatment.
  • Apply the pipelines to existing data sets of host transcriptomics and metabolomics, and pathogen whole-genome sequences.
  • Expand the pipelines to accommodate additional omics data types.