1. Recombination in modular biosynthetic clusters
Polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) clusters often have a modular structure in which each successive extension reaction is carried out by a different module that contains all the domains needed for that extension step. The modules are homologous, and some of them have DNA sequences similar enough to allow homologous recombination. An in silico model of homologous recombination was developed and used to examine the recombination potential between pairs of clusters. The program also predicts whether a recombinant cluster will synthesise a new product and, if so, the chemical nature of that product. These analyses indicate that relatively few sites are suitable for homologous recombination, which constrains both the generation of recombinants for biotechnological purposes and the evolution of modular clusters. Selection systems are being developed to facilitate the in vivo generation of predicted recombinants.
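The idea of searching two clusters for sites that could support homologous recombination can be illustrated with a minimal Python sketch. It is not the model described above: it simply assumes that a candidate site is a stretch of perfect sequence identity of at least `min_len` base pairs, and the function names and the default threshold are hypothetical choices made for illustration.

```python
def shared_identity_segments(seq_a, seq_b, min_len=20):
    """Return maximal exact-match segments as (start_a, start_b, length).

    Assumption: a candidate recombination site is a run of perfect
    identity of at least min_len bases (an illustrative threshold).
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()

    # Index every window of length min_len in seq_b by its sequence.
    index = {}
    for j in range(len(seq_b) - min_len + 1):
        index.setdefault(seq_b[j:j + min_len], []).append(j)

    segments = []
    for i in range(len(seq_a) - min_len + 1):
        for j in index.get(seq_a[i:i + min_len], ()):
            # Skip seeds that merely continue a match already reported
            # one position earlier on the same diagonal.
            if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1]:
                continue
            # Extend the seed to the right while the bases stay identical.
            length = min_len
            while (i + length < len(seq_a) and j + length < len(seq_b)
                   and seq_a[i + length] == seq_b[j + length]):
                length += 1
            segments.append((i, j, length))
    return segments


def single_crossover(seq_a, seq_b, segment):
    """Join seq_a (up to the end of the shared segment) to the rest of seq_b."""
    start_a, start_b, length = segment
    return seq_a[:start_a + length] + seq_b[start_b + length:]
```

Called on the DNA of two clusters, `shared_identity_segments` lists the shared stretches, and `single_crossover` builds the sequence that one crossover within a chosen stretch would give. Predicting the product of such a recombinant would additionally need the module and domain annotation, which this sketch does not attempt.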
2. Annotation of modular biosynthetic clusters
The rapid growth in the amount of DNA sequencing data makes it hard to analyse the data and find interesting sequences. In particular, it is difficult to bridge the gap between bioinformatics analyses and the biologist or chemist who uses the information. In cooperation with Prof. Dr. Daslav Hranueli (University of Zagreb) we have implemented a semi-automatic annotation system (ClustScan) for modular biosynthetic clusters, in particular PKS clusters. The key to the implementation is an XML format that describes clusters in a biologically and chemically meaningful way. After automatic identification of protein domains using HMM profiles, recent information about domain specificity is used to predict the chemical product; the user is free to edit the prediction to add any extra knowledge available. This greatly speeds up the analysis of clusters and avoids the mistakes that manual analysis inevitably generates. The analysis also makes it easy to recognise sequencing mistakes that result in apparent frame-shifts or stop codons, and to recognise unexpected introns in eukaryotic sequences. The approach will be extended to a full range of genes by further development of the XML description.
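To make the idea of an XML description of a cluster more concrete, the following Python sketch builds a small cluster document and reads a product-related prediction back out of it. The element and attribute names (`cluster`, `module`, `domain`, `type`, `specificity`) are hypothetical and are not the ClustScan format; the sketch also assumes that the domains and their specificities have already been identified, for example by a profile search.

```python
import xml.etree.ElementTree as ET


def build_cluster(name, modules):
    """Build an XML tree for a cluster.

    modules is a list of modules; each module is a list of
    (domain_type, specificity) pairs, with specificity None if unknown.
    The element and attribute names are illustrative assumptions.
    """
    cluster = ET.Element("cluster", name=name)
    for number, domains in enumerate(modules, start=1):
        module = ET.SubElement(cluster, "module", number=str(number))
        for domain_type, specificity in domains:
            attrs = {"type": domain_type}
            if specificity is not None:
                attrs["specificity"] = specificity
            ET.SubElement(module, "domain", attrs)
    return cluster


def predicted_extender_units(cluster):
    """List the substrate recorded on each module's AT domain, in module order."""
    units = []
    for module in cluster.findall("module"):
        at_domain = module.find("domain[@type='AT']")
        if at_domain is None:
            units.append("unknown")
        else:
            units.append(at_domain.get("specificity", "unknown"))
    return units


# Example with two hypothetical PKS modules.
example = build_cluster("example_pks", [
    [("KS", None), ("AT", "malonyl-CoA"), ("ACP", None)],
    [("KS", None), ("AT", "methylmalonyl-CoA"), ("KR", None), ("ACP", None)],
])
xml_text = ET.tostring(example, encoding="unicode")
```

Because the description is ordinary XML, a user could edit `xml_text` by hand (for instance to correct a specificity) and the prediction step would pick up the change when the document is parsed again.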
3. UV-protection in Cnidaria
Cnidaria produce mycosporine-like amino acids (MAAs) to protect against UV exposure. MAAs are probably produced by the shikimate pathway, which is not present in animals. The genome sequence of the sea anemone Nematostella vectensis was recently published. A bioinformatics analysis of shikimate genes was undertaken in cooperation with groups from Australia, Croatia, the UK and the USA. This revealed two shikimate-related genes in the N. vectensis genome that had been acquired by horizontal gene transfer from bacteria. Surprisingly, the genome sequence was also "contaminated" by the genome of a bacterium. It seems likely that this bacterium is a symbiont, and experiments are being carried out to clarify this.