Transgene integration, organization and expression
A. Kohli, X. Fu, R. M. Twyman and P. Christou
Molecular Biotechnology Unit, John Innes Center, Colney Lane, Norwich NR4 7UH, UK

     Transgene expression in plants is highly variable, even among plants independently transformed with the same construct. There is also no guarantee that primary transformants showing strong transgene expression will give rise to progeny with the same characteristics. The production of transgenic plants with stable, high-level transgene expression is important for the success of crop improvement programs based on genetic engineering. Many factors may be responsible for variable transgene expression, including the tendency for exogenous DNA to undergo rearrangement prior to integration, position effects, the effects of transgene copy number, and the effects of DNA methylation (Meyer 1998). It is therefore important to learn as much as possible about the mechanisms of transgene integration, and how transgene structure and organization affect expression and stability. In this communication, we discuss recent experiments investigating mechanisms of transgene integration and rearrangement, and the effect of such rearrangements on transgene expression.
     Perhaps one of the most surprising aspects of particle bombardment-mediated transformation is that genes introduced on separate plasmids are just as likely to cointegrate (integrate at the same locus) as genes actually linked on the same vector. We have begun to unravel the molecular basis of this phenomenon through a detailed examination of transgene structure at multiple-copy transgenic loci. Such studies revealed an unexpected multiple-tier organization (Fig. 1), in which clusters of transgene copies apparently joined end-to-end were interspersed with regions of genomic DNA (Kohli et al. 1998). We propose that such structures result from a two-phase integration mechanism. In the first phase, occurring before integration, exogenous DNA becomes ligated together to form transgene arrays lacking genomic DNA. These arrays are the substrates for integration, presumably through interaction with randomly occurring breaks in the endogenous chromosomes. While this process could be repeated throughout the genome, resulting in many unlinked transgenic loci, further transgene arrays instead tend to integrate at the same locus. Such clustering may reflect the recruitment of DNA repair complexes to the original site of integration, resulting in the introduction of many local double-strand DNA breaks. Strand exchange and recombination at such loci could also result in the elimination of variable-length regions of host DNA at the integration site, another common observation in transgenic plants. FISH analysis has revealed a further level of organization where such clustered transgene arrays are interspersed with larger regions of genomic DNA. This may reflect localized damage, caused by the metal particles, to DNA in a specific region of the nucleus, where tertiary structure brings looped DNA strands into close physical proximity. Notwithstanding this hierarchical organization, the individual transgene copies are sufficiently close together that rice transformation by particle bombardment generally produces a single transgenic locus.
     We have also carried out detailed investigations of the molecular mechanisms underpinning transgene rearrangements (Kohli et al. 1999). Within each of the transgene arrays that characterize the typical transgenic locus, individual transgenes are joined together at sites that can be termed plasmid-plasmid junctions. The nature of such junctions is poorly understood, especially in monocots, and their investigation could show how plasmids undergo rearrangement prior to integration and how this affects transgene expression. Ultimately, information derived from such studies could lead to the design of better transformation vectors.
     We analyzed 12 independent transgenic rice lines, each containing several plasmid-plasmid junctions. One of the most striking revelations was the involvement of the same region of the plasmid in more than one third of the rearrangements characterized. Specifically, a 19-bp palindromic sequence surrounding the TATA box of the CaMV 35S promoter was often involved in recombination events. Notably, this represented the region that, in the wild-type cauliflower mosaic virus RNA, is responsible for viral recombination events in planta. Almost all of the junctions appeared to have arisen by microhomology-mediated illegitimate recombination (Table 1), a ubiquitous process in which short complementary tails from two non-homologous DNA duplexes first overlap, followed by repair synthesis to join the parental molecules together. In a few rare cases, the junctions appeared to have been generated by the direct end-to-end ligation of the two recombination partners, i.e. there was no sequence overlap between the strands. The presence of direct repeats flanking some junctions suggested the involvement of a transposition-like process, perhaps the utilization of exogenous plasmid DNA as an illegitimate substrate by endogenous transposases. Other characteristics of the junctions included the presence of filler DNA (several nucleotides inserted at the junction, with no homology to either of the recombining partners), the presence of topoisomerase binding sites (suggesting that topoisomerases, which introduce nicks and breaks in plasmids to relieve supercoiling and remove knots and catenated links, may have been responsible for some of the strand breaks that preceded junction formation) and purine-rich tracts (suggesting the involvement of DNA tertiary structure in junction formation). Overall, the junctions generated during particle bombardment were similar to those described for other transformation procedures and indicated common, underlying mechanisms of transgene rearrangement regardless of transformation method. Through further such studies, transformation vectors could be optimized on the principle of avoiding recombinogenic sequences within transgene expression cassettes and deliberately including them in external regions. This would promote favorable recombination events during the preintegrative phase and avoid destructive transgene rearrangements.
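     The three junction classes distinguished above (microhomology, direct end-to-end ligation, and filler DNA) can be illustrated with a minimal Python sketch. This is not the procedure used in the study; it is a simplified illustration that assumes the sequenced junction has already been aligned to its two parental plasmid sequences, so that the left parent starts at the same position as the junction read and the right parent ends at the same position. The function names and the example sequences are hypothetical.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading positions at which a and b are identical."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def classify_junction(left_parent: str, right_parent: str, junction: str):
    """Classify a junction as microhomology, direct ligation or filler DNA.

    Assumption (hypothetical input convention): left_parent is aligned to
    the START of the junction read, right_parent to its END.
    Returns (class_name, bases), where bases is the shared microhomology
    or the inserted filler, and is empty for a direct ligation.
    """
    # How far the junction can be explained by the left parent ...
    left = common_prefix_len(junction, left_parent)
    # ... and, reading from the other end, by the right parent.
    right = common_prefix_len(junction[::-1], right_parent[::-1])
    overlap = left + right - len(junction)
    if overlap > 0:
        # Bases attributable to either parent: microhomology.
        return "microhomology", junction[len(junction) - right:left]
    if overlap == 0:
        # The parents abut exactly, with no shared bases.
        return "direct ligation", ""
    # Bases explained by neither parent: filler DNA.
    return "filler DNA", junction[left:len(junction) - right]


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    print(classify_junction("ACGTTGCATGCCCCCC", "TTTTTTTATGAAACCT",
                            "ACGTTGCATGAAACCT"))
    print(classify_junction("ACGTCCCC", "GGGGTTAA", "ACGTTTAA"))
    print(classify_junction("ACGTCCCCCC", "GGGGGGTTAA", "ACGTGATTAA"))
```

     In this sketch a positive overlap means that some junction bases could derive from either parent, a zero overlap corresponds to direct end-to-end ligation, and a negative overlap leaves bases that match neither parent. Chance matches next to a breakpoint can inflate the apparent microhomology, so real junction data would need more careful alignment than this.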

Kohli, A., M. Leech, P. Vain, D.A. Laurie and P. Christou, 1998. Transgene organization in rice engineered through direct DNA transfer supports a two-phase integration mechanism mediated by the establishment of integration hot-spots. Proc. Natl. Acad. Sci. USA 95: 7203-7208.
Kohli, A., S. Griffiths, N. Palacios, R.M. Twyman, P. Vain, D.A. Laurie and P. Christou, 1999. Molecular characterisation of transforming plasmid rearrangements in transgenic rice reveals a recombination hotspot in the CaMV 35S promoter and confirms the predominance of microhomology-mediated recombination. Plant J. 17: 591-601.
Meyer, P., 1998. Stabilities and instabilities in transgene expression. In: Transgenic Plant Research, Lindsey, K. (ed.). Harwood Academic Publishers, Switzerland, p 263-275.