INVESTIGATING THE PLASMODIUM INVADOME USING
HIGH-THROUGHPUT REVERSE GENETICS
Theodore Richmond Sanderson
Clare College, University of Cambridge
This dissertation is submitted for the degree of Doctor of Philosophy
September 2015
Theo Sanderson: Investigating the Plasmodium invadome using high-throughput
reverse genetics, September 2015
This dissertation is the result of my own work and includes nothing
which is the outcome of work done in collaboration except as
declared in the Preface and specified in the text.
It is not substantially the same as any that I have submitted, or, is
being concurrently submitted for a degree or diploma or other
qualification at the University of Cambridge or any other University
or similar institution except as declared in the Preface and specified
in the text. I further state that no substantial part of my dissertation
has already been submitted, or, is being concurrently submitted for
any such degree, diploma or other qualification at the University of
Cambridge or any other University or similar institution except as
declared in the Preface and specified in the text.
It does not exceed the prescribed word limit for the relevant Degree
Committee.
ABSTRACT
The invasion of host erythrocytes by Plasmodium merozoites is an es-
sential event in both the parasite life cycle and in malaria pathogen-
esis. Many Plasmodium genes have been identified that play a role
in this process, but bioinformatic analysis suggests that much larger
numbers of as yet uncharacterised proteins may also be involved.
I set out to use the PlasmoGEM reverse-genetic approaches that
are being developed at the Wellcome Trust Sanger Institute to sys-
tematically investigate genes that have been associated with erythro-
cyte invasion. I first conducted a screen of 145 putative invasion-
related genes in the rodent malaria model Plasmodium berghei, using
a recently developed barcode-sequencing approach. Parasites were
transfected with a pool of high efficiency knock-out constructs, each
tagged with a unique barcode. The quantities of different barcodes
in the population were counted over the course of the infection,
and these data were used to identify the
viability and relative fitness of knock-out parasites for each gene in
this panel. This work tripled the number of putative invasion-related
genes that have been investigated genetically, and I demonstrated the
viability of knock-outs for three genes previously described as essen-
tial.
I then explored approaches to adapting PlasmoGEM methodology,
which is currently only in use in rodent Plasmodium systems, to a
human-adapted line of the zoonotic parasite Plasmodium knowlesi that
can grow in human erythrocytes in vitro. I developed the vectors
needed to use PlasmoGEM technology in this system, and tested a
number of constructs generated to target putative invasion-related
genes. I was able to use these linear vectors both to knock out and
to C-terminally tag genes in this species, which opens up potential
for future scale-up of PlasmoGEM in a human Plasmodium species for
the first time.
Finally, I developed a number of in silico approaches to analyse
the wealth of data generated by these approaches at scale. Knock-out
phenotypes were investigated in the context of a previous putative
protein-protein interaction network and a correlation between
the phenotypes of connected genes was observed. In addition, a new
online database was created to facilitate the collation of mutant phe-
notype data in a systematic and machine readable fashion.
CONTENTS
I Background to the project
1 Introduction
1.1 Current control measures
1.2 Parasite life-cycle
1.3 Species of malaria parasite infecting humans
1.4 Parasite species in the lab
1.5 New sequencing technologies enable new approaches
1.6 The biology of invasion
1.7 Approaches to studying invasion
1.8 Experimental genetics in Plasmodium
1.9 The PlasmoGEM project
1.10 Approaches to enhance transfection further
1.11 Aims and objectives
2 Materials and methods
2.1 Production of DNA vectors
2.2 Plasmodium knowlesi in vitro culture
2.3 Plasmodium berghei
2.4 Data analysis
II Results
3 Initial investigations: gene selection and P. berghei screen
3.1 Assessing the Hu et al. invadome
3.2 Selecting the core invadome
3.3 Barseq in P. berghei
3.4 Barseq growth phenotypes
3.5 Assessment of barseq phenotypes en-masse
3.6 Conclusions
4 The transfer of PlasmoGEM technology to P. knowlesi
4.1 Introduction
4.2 System development
4.3 Preparation of PlasmoGEM vectors
4.4 Plasmodium knowlesi gene targeting
4.5 Transfection of PlasmoGEM vectors
4.6 Towards CRISPR/Cas9 in P. knowlesi
4.7 Generation of Cas9 mother vectors
4.8 Enhancing pJazz transfections with CRISPR/Cas9
4.9 Discussion
5 New approaches and tools for large scale phenotypic analysis in silico
5.1 Barseq in P. berghei
5.2 Network-based analysis of the invadome
5.3 Large scale analysis of mutant phenotypes
5.4 Building the infrastructure to collate reverse genetic data across species
5.5 Applying PhenoPlasm data to invasion
5.6 Discussion
III Discussion
6 General discussion
6.1 Achievements
6.2 Screening limitations
6.3 Extensions of work in P. berghei
6.4 Extensions of work in P. knowlesi
6.5 Final thoughts
Bibliography
IV Appendix
A Tables
A.1 PlasmoGEM vectors used in this work
B Supplementary figures
C Raw barseq data
LIST OF FIGURES
Figure 1 The decline in malaria deaths over the last century
Figure 2 Distribution of gene descriptions in the P. falciparum genome
Figure 3 The life-cycle of the malaria parasite
Figure 4 Phylogenetic tree illustrating the evolutionary relationship between Plasmodium species
Figure 5 Long-tailed macaques, Macaca fascicularis
Figure 6 P. knowlesi growth curves after episomal and integrative transfections
Figure 7 The key steps of merozoite invasion
Figure 8 Schematic of a merozoite
Figure 9 Model for the motor complex involved in invasion
Figure 10 Construction of PlasmoINT network
Figure 11 Schematic of ends-in integration
Figure 12 Synthesis-dependent strand annealing
Figure 13 PlasmoGEM procedure for generating targeting vectors
Figure 14 PlasmoGEM procedure for generating targeting vectors
Figure 15 Schematic to explain barcode-sequencing
Figure 16 Schematic of CRISPR/Cas9
Figure 17 Giemsa-stained smears from the stages of the tight synchronisation protocol
Figure 18 Histogram showing the peak expression of all Plasmodium falciparum genes and the subset that are part of the invadome
Figure 19 Venn diagram of Plasmodium genes showing 1:1:1 orthology.
Figure 20 Phenotypes for core invadome genes as already known from RMgmDB
Figure 21 Gene abundances in a single mouse for the first invadome barseq experiment
Figure 22 Fold-changes of all genes in barseq experiment 1
Figure 23 Fold-change on successive days in barseq experiments
Figure 24 Fold-change values for the six control constructs are as predicted
Figure 25 Strand-specific AP2-O transcription for AP2-O locus
Figure 26 Individual knock-outs of three genes confirm barseq results which contradict previous studies
Figure 27 Comparison of barseq phenotypes with previous RMgmDB phenotypes
Figure 28 Barseq phenotypes of rhoptry-localised genes
Figure 29 Barseq phenotypes of IMC-localised genes
Figure 30 Barseq phenotypes of miscellaneous other genes
Figure 31 Proportion of various phenotypes in the core invadome barseq
Figure 32 Known phenotypes
Figure 33 The effect of different sera on P. knowlesi growth
Figure 34 The effect of atmosphere on P. knowlesi growth over 48 hours.
Figure 35 The effect of shaking vs. static culture on P. knowlesi growth
Figure 36 Luciferase assays demonstrate high efficiency of P. knowlesi transfection
Figure 37 Effect of DNA quantity on transfection efficiency
Figure 38 GFP expressing parasites are seen one day after P. knowlesi transfection
Figure 39 The effect of electroporation on parasite morphology
Figure 40 P. knowlesi is sensitive to DSM1, pyrimethamine and G418, but with IC50 values spanning four orders of magnitude.
Figure 41 Crystal structure of PfDHODH
Figure 42 Premature stop codon in an ApiAP2 gene in A1.H1 isolates
Figure 43 Segmental duplication in my population of the A1.H1 clone
Figure 44 Possibilities for library expansion
Figure 45 Phylogenetic analysis of sequences homologous to PKNH_0941100
Figure 46 Phylogenetic analysis of PKNH_1207600 orthologs in Plasmodium
Figure 47 Phylogenetic analysis of DHHC3 orthologs in Plasmodium and T. gondii
Figure 48 Positive integration PCR results
Figure 49 Maps of double crossover integration strategies using PlasmoGEM vectors both for knock-outs and tagging in P. knowlesi
Figure 50 An example of the genotyping data produced by qPCR.
Figure 51 qPCR results reveal four modifications which are dominant in their populations after transfection.
Figure 52 Immunofluorescence of 3xHA-tagged apical sushi protein in P. knowlesi reveals the expected apical localisation.
Figure 53 Alignment showing the C-terminus of PkDHHC3
Figure 54 Elevated dhfr copy number in non-targeting transfections
Figure 55 Quantitative PCR confirms attempt to dilution clone PKNH_0941100 knock out.
Figure 56 Whole genome sequencing reveals successful gene deletion at the PKNH_0941100 locus
Figure 57 Whole-genome sequencing reveals a failure to delete PKNH_1207600 (highlighted in red)
Figure 58 Whole genome sequencing reveals failure to target DHHC3.
Figure 59 Coverage of selection cassette as compared to the general genome in whole genome-sequenced samples.
Figure 60 pJAZZ arms
Figure 61 Distribution of on-target cutting scores for all possible guide RNAs in the P. knowlesi genome
Figure 62 Distribution of predicted CRISPR on-target cutting scores compared to mammalian systems and analysed on a per-gene basis indicates a large potential for CRISPR in P. knowlesi
Figure 63 Position of CRISPR/Cas9 cutting site within PKNH_0941100, a gene already known to tolerate deletion in P. knowlesi from work earlier in this chapter.
Figure 64 Decreased wild-type presence in CRISPR-assisted pJAZZ transfection
Figure 65 Possible trajectories for a parasite after transfection with a linear vector
Figure 66 Selecting the 90% confidence subnetwork results in a network with relatively low connectivity for most genes.
Figure 67 Invasion sub-network in context of entire 90% confidence network
Figure 68 Separation by localisation in the invadome sub-network
Figure 69 Testing centrality-lethality in Plasmodium invasion genes
Figure 70 Analysing the invadome subnetwork according to mutant phenotype
Figure 71 The core invadome coloured by P. berghei barseq mutant phenotypes, and labelled
Figure 72 Association plot demonstrates that connected nodes tend to have the same mutant phenotype
Figure 73 Essentiality ratios for invadome genes clustered by barseq mutant phenotype
Figure 74 Degree of knock-down in AP2-O knockout (in published ookinete data) compared to the same gene’s individual mutant phenotype
Figure 75 PhenoPlasm database schema
Figure 76 PhenoPlasm screenshot
Figure 77 PhenoPlasm single gene view
Figure 78 Interface for data addition to PhenoPlasm
Figure 79 Supplementing invadome barseq data with PhenoPlasm database phenotypes
Figure S1 AP2-O expression levels by stage
Figure S2 Map of CRISPR/Cas9 vector
Figure S3 Barseq and PhenoPlasm connectedness analysis
LIST OF TABLES
Table 1 Comparison of transfectability of the three parasite species analysed in this work
Table 2 Annealing reaction for the creation of dsDNA inserts encoding guide RNAs.
Table 3 Recipe for P. knowlesi culture medium
Table 4 Barseq growth phenotypes for invadome genes
Table 5 Genes for which barseq phenotype agrees with previous data
Table 6 Calculated IC50s in P. knowlesi for three drugs
Table 7 Targeting constructs available for initial P. knowlesi experiments
Table 8 Result of pJAZZ transfection attempts
Table 9 Summary of results for three genes for which knock-out attempts have been made in three Plasmodium species
Table 10 Contingency table showing network edges grouped by mutant phenotypes at either node
Table 11 Mutant phenotypes for invasion genes predicted based on their essentiality ratios.
Table 12 Collated mutant phenotypes for Hu et al. invasion genes which were not included in the 1:1:1 core invadome
Table S2 DNA oligos used in these studies. Note that other oligos for PlasmoGEM vectors are not listed since these are displayed on the PlasmoGEM website.
Part I
BACKGROUND TO THE PROJECT
1 INTRODUCTION
The battle between humanity and malaria parasites has raged for millennia.
Early skirmishes are recorded in the human genome, with the
acquisition 5,000-10,000 years ago of alleles [114, 90] causing glucose-6-phosphate
dehydrogenase deficiency, alpha thalassaemia and sickle cell anaemia.
The diseases these variants cause, in homozygous form, are collateral
damage, representing the overwhelming selective pressure of malar-
ial deaths [114] that has shaped human evolution since the emergence
of agricultural societies 10,000 years ago. [101]
The past century has seen a reversal of fortune, with a series of defeats
for the parasite enabled by two key factors. The discovery of the
parasite lifecycle (section 1.2) led to the draining of swamps and use
of insecticides to reduce populations of the vector. This, combined
with the development of a succession of cheap and effective drugs,
has allowed the elimination of malaria in ninety-three countries since
1900 and a corresponding reduction in deaths (Fig. 1).
But parasites have shown an alarming propensity to develop re-
sistance to any drug thrown at them. Resistance to chloroquine was
first observed in Thailand in 1957. It had reached Africa by the 1970s,
where it was the likely cause of increased mortality. [179] Resistance
to sulfadoxine-pyrimethamine developed in the late 1970s, and mefloquine-
resistant parasites appeared in the mid 1990s. [200] The emergence of
resistance to these drugs frequently followed the same pattern, with
parasites appearing first in Southeast Asia, followed by a spread of
the resistant alleles across the planet.
With the emergence of artemisinin-resistant parasites in Cambodia,
[51] there is an urgent need for new drugs, and ideally for the de-
velopment of an effective vaccine. To date, the most potent vaccines
show only modest reductions in morbidity. [168] As in many fields,
malarial drug development is shifting to a target-based approach, in
which drugs are specifically developed towards promising parasite
proteins.
Figure 1: Malaria mortality has declined sharply in the last century but remains
high in Sub-Saharan Africa.
[Line chart: annual malaria mortality per 100,000 against year (1900 to 2013),
plotted for the World, the World excl. SSA, and Sub-Saharan Africa, with
annotations marking the development of chloroquine, the first chloroquine
resistance, and ACTs.]
The decline of malaria seen between 1930 and 1950 was the result
of many efforts, but the development of chloroquine was a major
factor. The rise in drug resistant parasites from 1954 onwards is
correlated with a resurgence of malarial deaths, warning that the
recent inroads made against the parasite [149] could be rapidly
eroded by resistance to artemisinin. (ACTs: Artemisinin Combination
Therapies, SSA: Sub-Saharan Africa) (see footnote 1)
The most obvious prerequisite for a drug target is that its gene be
essential; as we will see in the biology of invasion, parasites often
have many redundant pathways for achieving a particular biological
process. Beyond this, it is important to understand the role the pro-
tein plays in parasite biology and how likely resistance is to develop.
Identifying and prioritising targets will require detailed insights into
parasite biology at the protein level, but our understanding of the
Plasmodium biology is far from complete. The most common anno-
tation for P. falciparum genes is “conserved Plasmodium protein, un-
known function”, which is the sole description for around 30% of
genes (Fig. 2). Because the apicomplexans diverged from other eu-
karyotes very early in evolutionary history many genes have no ho-
mologs in other systems, and it will be a long time before we under-
stand the role that some of these play in parasite biology.
Footnote 1: 1900-1997 data from WHO World Health Report, 1999. I calculated 2013 figures from
WHO statistics using World Bank population data.
Figure 2: A large proportion of the genes in the Plasmodium genome have entirely
unknown functions.
[Pie chart of gene descriptions. Labelled slices: conserved Plasmodium protein,
unknown function; rifin, PIR protein (RIF); conserved Plasmodium membrane
protein, unknown function; conserved protein, unknown function; unspecified
product; erythrocyte membrane protein 1, PfEMP1 (VAR); Plasmodium exported
protein, unknown function; RNA-binding protein, putative; zinc finger protein,
putative; stevor, PIR protein; DnaJ protein, putative; rifin, pseudogene, PIR
protein, pseudogene (RIF); probable protein, unknown function; Others.]
This chart shows the distribution of P. falciparum genes with the
description provided by PlasmoDB. The ’others’ slice represents
descriptions applying to fewer than 15 genes. [157]
One particularly important event in the parasite life-cycle, the full
details of which are still to be elucidated, is erythrocyte invasion: the
process by which the parasite enters a new host red blood cell. This
process is considered an important target for drug and vaccine development.
[41, 104] While the roles of some invasion-related genes
have been outlined, [42] there are hints that there may be far more
involved, [92] and this project aims to analyse some of these genes at
scale to better understand their functions and importance.
Somewhere in the milieu of ’conserved Plasmodium proteins, unknown
function’ may lie an invasion-related drug target that will one
day quell the rise of multi-drug resistant parasites, and so it is important
to gain more data on these proteins. Such useful data include
mutant phenotypes, protein localisations and genetic interactions.
1.1 Current control measures
1.1.1 Prophylaxis and vector control
One of the most cost-effective means of malaria control has been the
prevention of bites by long-lasting insecticide-treated mosquito nets
and indoor residual spraying. [81] However, this approach shows no
signs of entirely eliminating transmission, and its success is threat-
ened by increasing levels of pyrethroid resistance. [8]
A number of chemoprophylactic agents exist for travellers visiting
malaria-endemic countries, but attempts to prescribe these on long
timescales result in low rates of compliance, [31, 32] whilst some
cause side-effects and resistance inevitably develops. [31, 131] Nev-
ertheless they may have important potential roles in the protection of
children and pregnant women. [81]
1.1.2 Treatments
Once malaria is suspected in a patient the current frontline treatment
is one of the artemisinin combination therapies. These partner
a rapidly-acting artemisinin compound (which has a short clearance
time in the body) with a longer-lasting drug which can clear any re-
maining parasites.
These drugs are highly effective, and their roll-out has helped to
reduce the number of deaths due to malaria over the last decade. But
the emergence of resistance has the potential to threaten these devel-
opments. ’Slow-clearing’ parasites have arisen in Western Cambodia
with mutations that allow them to evade clearance under artemisinin
treatment and to produce recrudescent infection. [182] Such parasites
now give rise to treatment failure under a number of drug regimens.
If the parasite alleles causing these effects spread in Africa, a signifi-
cant increase in mortality seems likely.
1.1.3 Vaccines
There is no vaccine currently available against malaria which shows
both cost-effectiveness and good efficacy. The most developed tar-
geted vaccine to date is the RTS,S subunit vaccine based on the cir-
cumsporozoite protein (CSP), which demonstrated 36% efficacy in
reducing episodes of malaria in children but no significant effect on
mortality. [168] A number of other vaccines are under development.
Many of these are targeted to various components of the parasite’s
invasion machinery.
An alternative approach is the intravenous injection of irradiated
whole sporozoites—the stage of the parasite injected by mosquitoes.
In early trials this method provided highly effective protection against
subsequent P. falciparum challenge, [172] but numerous obstacles exist
to its roll-out in endemic settings, including the need for manual dissections
of thousands of mosquitoes and a liquid-nitrogen cold-chain
distribution system. [69]
1.2 Parasite life-cycle
Like many parasites, Plasmodium has an exquisitely complex life-cycle
reliant on more than one host. During this process the single-celled
eukaryote transforms itself into many and varied forms, and journeys
large distances through both an insect and a vertebrate.
1.2.1 In the human host
As a malaria-infected Anopheles mosquito begins to probe for blood
vessels (Fig. 3), it injects saliva containing vasodilators and anticoagu-
lants. Plasmodium hijacks this process to transmit sporozoites, present
in the salivary glands, into a new host. In each blood meal, 15-123
motile sporozoites are injected into the tissue of the skin. [159] These
parasites find their way into blood vessels and are carried in the circulation
to the liver sinusoid. Here, they traverse the sinusoidal cell
layer to infect hepatocytes.
Once inside a hepatocyte, hidden from the immune system, a sin-
gle sporozoite divides again and again over 2-16 days (depending on
parasite species) to produce thousands of invasive merozoites. [159]
During this process the parasite induces the death of the host cell
and causes it to detach. The detached cell extrudes merosomes, pack-
ets of merozoites which bud off and enter the bloodstream. In the
blood the merosomes rupture, releasing merozoites which can invade
erythrocytes. [185]
Upon entering an erythrocyte, a parasite initially assumes a ‘ring’-
like morphology—the parasite nucleus appears as the jewel on a ring
of cytoplasm, when Giemsa-stained. The parasite then begins a 24-72
hour maturation process (depending on parasite species) culminating
in the release of new merozoites.
After a period in the ring stage, the parasite becomes a trophozoite
as it digests the cell’s haemoglobin, forming crystals of haemozoin
in the food vacuole. It then develops into a schizont as the genome is
replicated and up to 32 daughter merozoites are produced.
Figure 3: The Plasmodium life cycle
[Schematic with numbered stages: 1. Sporozoites injected and enter bloodstream;
2. Many rounds of replication in the liver; 3. Intraerythrocytic cycle (ring,
trophozoite, schizont, egress, invasion); 4. Gametocyte formation; 5. Activation;
6. Fertilisation; 7. Ookinete traversal of the gut wall; 8. Oocyst forms and
matures; 9. Sporozoites invade salivary gland.]
This schematic outlines the major stages of the parasite lifecycle.
Plasmodium parasites transition between a multitude of lifestages
as they traverse the variety of environments in the different organs
of their hosts.
Once this development is complete the mature schizont can begin
an organised process of egress in which first the parasitophorous vac-
uole membrane, then the erythrocyte membrane, is broken down and
merozoites are released into the bloodstream to begin a new round
of invasion.
While this intra-erythrocytic cycle can continue indefinitely, a small
proportion of infected cells undergo a developmental switch, and be-
gin to express the transcription factor AP2-G. [178, 102] These cells
develop to schizonts in typical fashion, but each merozoite released
is committed not to continue the asexual cycle but to eventually be-
come a gametocyte. It will invade a new red blood cell, and form a
trophozoite, but then develop not into a schizont but into an elongated gametocyte.
These sexual stages have two distinct morphologies, microgametocytes
and macrogametocytes, or male and female respectively.
1.2.2 In the mosquito vector
When these gametocytes are subsequently taken up by a mosquito
as part of its blood meal, the environment of the mosquito midgut
triggers ‘activation’. Male gametocytes exflagellate, releasing motile
gametes which can fuse with the female to form a diploid zygote.
This zygote then develops into a motile ookinete which burrows into
the midgut wall and forms an oocyst. Each oocyst grows and divides
to produce thousands of sporozoites which are ultimately released
and make their way to the salivary glands, ready to be injected into
the next human host.
1.3 Species of malaria parasite infecting humans
This particular parasitic way of life, with an insect and vertebrate
host, is shared by countless different Plasmodium species. It is now
known that six species of malaria parasite cause significant numbers
of human infections.
Because Plasmodium malariae and Plasmodium ovale (now known to
comprise two distinct species [186]) are of low prevalence and rarely
cause severe malaria, they will not be discussed in detail here. But
to set this work in context, some detail is needed on three parasite
species.
Figure 4: The evolutionary history of malaria parasites in different species
[Phylogenetic tree with Species and Host columns. Species shown: P. gallinaceum,
P. berghei, P. vivax, P. cynomolgi, P. knowlesi, P. malariae, P. ovale,
P. gaboni, P. reichenowi and P. falciparum, with a clade labelled Laverania.]
Parasites used in subsequent analysis are denoted by boldface.
Host pictograms denote (in order of appearance): birds, rodents,
humans, monkeys, chimpanzees. The tree illustrates the interesting
relationships, or lack thereof, between host and parasite phylogenies:
P. falciparum is equally closely related to P. vivax (a human
parasite) and P. berghei (a rodent parasite). The tree represents a
subset of a phylogeny of cyt b and cox1 genes by Duval et al. [56]
1.3.1 Plasmodium falciparum
90% of malaria deaths occur in sub-Saharan Africa. This is enabled by
the fact that infections in this region are caused almost exclusively by
the rapacious P. falciparum, compounded by often inadequate health-
care provision. This species is a member of the Laverania, a subgenus
of ape parasites that form an outgroup to the other human malaria
parasites and to rodent parasites (Fig. 4). It itself appears to have
originated from this group in a host-switching event from a gorilla-
infecting parasite in Africa, relatively recently in evolutionary history.
[126]
Two aspects of P. falciparum’s biology make it especially deadly.
Where many other species show a marked preference for invading
reticulocytes, P. falciparum can invade red blood cells of any age al-
lowing it to produce higher parasitaemias and more severe anaemias.
Another dangerous feature of P. falciparum is its habit of sequestra-
tion. Malaria parasites can be cleared if they pass through the spleen
and so some species cause their host cells to present adhesive proteins
and ‘sequester’ within capillaries. In P. falciparum it appears that
sequestration can occur in the vasculature of the brain, where it some-
times combines with activation of the immune system to result in
cerebral malaria. This is the most severe pathology of any malaria par-
asite, which can cause coma and ultimately death. [20] Most sufferers
are children, and even those who survive can have lasting cognitive
deficits. [93]
1.3.2 Plasmodium vivax
The origin of P. vivax, both in terms of geography and ancestral host,
has been hotly contested. [29, 46, 40] One recent view holds that this
parasite too hails from Africa. [127] The recent discovery of a ‘syl-
vatic’ clade of P. vivax-like parasites in African apes has led to the
suggestion that an ancient P. vivax was originally able to infect hu-
mans, gorillas and chimpanzees in Africa.
In ancestral African populations, the selective pressure of P. vivax
morbidity triggered a loss of expression of the host receptor it re-
quires, the Duffy antigen receptor for cytokines (DARC), from ery-
throcytes. [141]
In the new model, this led to the extinction of P. vivax in humans
in Africa, but the parasite lived on in apes as the sylvatic clade and
human P. vivax today represents a population that escaped Africa,
undergoing a genetic bottleneck.
An alternative view is that the parasite emerged in Asia, [46] imply-
ing a previous monkey host for P. vivax. Many of the well-characterised
Plasmodium species related to P. vivax (such as P. knowlesi and P. cynomolgi)
infect monkeys.
Recent observations on Madagascar have challenged the view that
P. vivax is absolutely dependent upon DARC for invasion, with robust
data indicating the presence of P. vivax in some Duffy-negative individuals.
[139] This is just one example of how little we know about
P. vivax, despite the substantial morbidity it causes worldwide, [67]
primarily because of the intense difficulty of culturing it in vitro.
Figure 5: Long-tailed macaques, Macaca fascicularis, which along with pig-tailed
macaques are the primary hosts of P. knowlesi (original photograph)
1.3.3 Plasmodium knowlesi
In contrast to the two previously mentioned human species, P. knowlesi’s
primary hosts are not humans but the macaques of Southeast Asia
(Fig. 5). All extant parasites are derived from a parasite that lived
98,000 - 478,000 years ago. This emergence is before the human settle-
ment of Southeast Asia, meaning that the parasite likely emerged in
macaques. [116]
There are disputes over who was the first person to observe P.
knowlesi microscopically, [11] but it owes its name to Robert Knowles.
He and Biraj Mohan Das Gupta described the parasite, which had
been discovered in a macaque imported from Singapore. [108] They
observed that it caused a ’quotidian’ fever in humans (temperatures
peaking once each day), rather than the tertian or quartan fevers ob-
served with other human Plasmodium. [160]
Most early data on human P. knowlesi infection comes from a pe-
riod in which the parasite was used as a pyretic to treat sufferers of
neurosyphilis. [164] In a study in Edinburgh it was passaged from
one patient to the next with decreasing potency on each occasion; but
in Bucharest, where the treatment was most widespread, after 170 sequential
passages the parasite developed such virulence that its use
was ended. [160]
1.3.3.1 P. knowlesi causes large numbers of infections with potentially severe symptoms
The first reported natural human infection with P. knowlesi was in
1965 when a ‘U.S. Army Map Surveyor’, working in the jungles of
Malaysia, developed a fever, [37] though, given reports that he did all
his work by night, he is more likely to have been a spy. [160] Samples
of his blood were subsequently inoculated into ‘volunteers’, result-
ing in quotidian infections and moderate to severe clinical symptoms.
Sporozoite inoculation revealed that the infection could be passed on
by mosquitoes which had fed on an infected human, proving the theo-
retical possibility of parasite transmission within human populations.
While P. knowlesi enjoyed attention as a model parasite in the 1970s
and 1980s (see later sections), there was little further discussion of
human infections until a seminal discovery in 2004.
Balbir Singh and colleagues were investigating a focus of malaria
infections in Kapit, a division of Sarawak in Malaysian Borneo. [176]
In Kapit, a fifth of infections were then identified by Giemsa-smear
as P. malariae, but these infections did not fully resemble this typically
benign parasite species: 18.5% of patients had more than 5,000 par-
asites per µl of blood. The researchers aimed to investigate whether
these infections represented a new P. malariae variant, or an emergent
Plasmodium species. The answer turned out to be neither of these.
After PCR assays for each of the five major Plasmodium species gave
negative results, the researchers amplified the SSU rRNA gene from
these parasites and sequenced it. The sequence they found was that
of P. knowlesi, then considered almost exclusively a simian parasite.
P. knowlesi can assume a band-like morphology which is typical of P.
malariae, and this was the cause of the diagnostic confusion. Newly
designed PCR primers then revealed that 58% of malaria infections
in Kapit contained P. knowlesi parasites, and subsequent research has
shown that human P. knowlesi infections are widely distributed in
Borneo and beyond. [44, 100]
Studies have since confirmed the severity of P. knowlesi malaria,
which has a fatality rate of 1–2%. [47] Many of the symptoms of severe
malaria can be present including anaemia, jaundice, acute renal
injury, hyperparasitaemia, hyperlactaemia, hypotension and respira-
tory distress. [43]
1.3.3.2 Insights into population genetics
One key early question was whether the population of parasites in
humans represents a single clade of the P. knowlesi radiation. The
observation that most alleles found in human infections are shared
by infections of wild macaques demonstrated this was unlikely to be
the case. [116]
In a separate project we sequenced entire parasite genomes from
patients presenting with P. knowlesi and found that they clustered
into two distinct groups. [156] Dimorphic genes included the invasion
ligand NBPXa, and the alleles of this gene were implicated in changes
in parasite virulence. [2]
At the time the cause of this sympatric dimorphism was unclear,
but recent research has revealed a genomic dimorphism between par-
asites in M. fascicularis and M. nemestrina hosts, [54] and I have since
conducted in-silico PCR analyses to confirm that these reflect the two
groups we saw in patient isolates (unpublished data).
1.3.3.3 The mosquito host of P. knowlesi
The P. knowlesi parasite is transmitted by members of the Leucos-
phyrus group of Anopheles mosquito species, predominantly A. latens.
[180] The range of these species largely explains the parasite’s distri-
bution in Southeast Asia, centred around the island of Borneo. These
hosts are exophilic, rarely venturing into houses, and this explains
why most P. knowlesi patients are people who work in forests. [180, 43]
This raises the prospect that changes in land-use might drive a switch
in mosquito behaviour and an increase in P. knowlesi transmission.
1.4 parasite species in the lab
There are a number of model systems of malaria used to investi-
gate different aspects of parasite biology. It is worth considering the
strengths and weaknesses of each.
1.4.1 P. falciparum
As the parasite species causing the most morbidity and deaths worldwide,
P. falciparum has long been the primary focus of malaria research.
Early experimental work used human volunteers, but there
were also primate models in chimpanzees and Aotus monkeys.
A significant breakthrough was achieved with the development in
1976 of a system that allowed continuous in vitro culture of the parasite.
[194] This involves a low-oxygen atmosphere, produced using
either malaria gas or a candle jar, and media based on RPMI-1640
(developed for culturing leukemia cells), supplemented with serum
or its equivalents.
P. falciparum was the first parasite to have its genome sequenced,
in 2002, [70] but reverse genetic studies of this parasite have lagged
behind the murine malarias. Difficulties in the experimental genetic
investigation of P. falciparum result from its very low transfection efficiency
(section 2.2.6) and the need to transfect with circular DNA,
which forms episomes that can only be removed by a prolonged
period of drug cycling, though recent targeted endonuclease developments
may change this.
1.4.2 P. berghei
In any tropical region, many species of animal will be infected with
malaria parasites, and the thicket rats of central Africa are no excep-
tion. It is from these animals that P. berghei was isolated, but in the lab
it can infect a range of rodents including laboratory strains of mice
and rats. [192]
P. berghei has proved a powerful in vivo model in which parasites
can be transfected with linear DNA and achieve high transfection
efficiencies. Parasites grow rapidly (parasitaemia increasing tenfold
per 24-hour cycle during the logarithmic period of infection). The
model allows the study of a number of host-specific features of par-
asite disease, including cerebral malaria, [130] host nutritional status
and immunology. [112] With knock-out mice now available for large
numbers of genes the model also allows the genetic investigation
of host-parasite interaction from both sides. The RMGMdb database
maintained by Leiden University [117] currently lists 625 attempted
gene disruptions in P. berghei.²
1.4.3 P. knowlesi
Although the discovery that P. knowlesi causes a significant clinical
burden is very recent, the parasite has long been of interest to para-
sitologists. Before the development of robust culture systems, animal
models were the only means to study malaria, and primate models
were considered the most accurate representations of human disease.
Additionally, in parallel with the development of culture systems for P.
falciparum, it also became possible to culture P. knowlesi in vitro, using
macaque red blood cells.
Kocken et al. used both this system, and growth in animals, to
show that the parasite could be transfected with both circular and
linear DNA, and demonstrated that a number of selectable markers
could be used. [109] However, there have been very few publications
in which P. knowlesi parasites are genetically modified in the macaque
context. This is in part due to the difficulty of acquiring macaque
blood with sufficient quantity and frequency for routine culture.
1.4.3.1 Adaptation to human erythrocytes
While by 2004 it was established that P. knowlesi could theoretically
replicate in human red blood cells, indeed that it could cause vast
parasitaemias in patients, attempts to culture the parasite in human
cells always failed. It was clear that parasites had some ability to
invade human cells: if donor parasites from a cynomolgus culture
were incubated with human erythrocytes, some infected human cells
were seen. [142] But parasites would quickly die out if continuously
cultured in the human cells.
Moon et al. set out to establish a line of P. knowlesi that could
grow robustly in human erythrocytes. [142] The starting point for
this experiment was the A strain, derived from the H strain first
isolated from that ‘surveyor’ in Malaysia (section 1.3.3.1) and maintained
since then by passage in macaques.
² This figure includes multiple attempts to disrupt the same gene.
Moon first confirmed that if he mixed human cells with macaque
cells in a 3:1 ratio he could maintain parasites in this hybrid blood.
The 25% of the erythrocytes which were from macaques were enough
to keep the parasites growing, albeit at a lower rate than usual, and
his hypothesis was that the presence of 75% human erythrocytes
would provide a selective pressure for the parasites to adapt to grow
inside these cells in vitro. Any parasite that developed a mutation
allowing it to do so better, even to a small extent, would have an ad-
vantage over its neighbours which could not exploit most of the cells
surrounding them.
After eight months of growing the parasites in this mixture, he
found that the resultant parasites were capable of replicating well (3–4×
increase per intraerythrocytic development cycle (IDC)) in human
blood, although they still grew somewhat better (4–5×) in macaque
cells.
Across the Atlantic, Lim et al. had also been adapting a line of P.
knowlesi to human erythrocytes, in a similar fashion. [120] In this case
they found that the changes that occurred in this time expanded the
range of ages of erythrocyte that could be invaded.
However, in neither case has there yet been a published mutation,
or mutations, shown to be causative in the adaptation of the parasite
to the human cells.
Moon performed experiments to establish whether the human-adapted
line was amenable to transfection. Using schizont transfection he
achieved a transfection efficiency of more than 10% on day 1 post-transfection,
a better result than in any previously reported Plasmodium
transfection. He also showed that he could transfect with linear
DNA to achieve ends-in integration (see section 1.8.2), a feat not pos-
sible in P. falciparum.
However it is worth noting that in this integration experiment (Fig. 6)
the parasitaemia of transfectants did not reach 2% until day 14, 11
days later than in the transient transfection condition.
The parasite typically multiplies at a rate of 2.5–3.5 fold per 28
hours. Over the 11-day delay, $2.5^{11 \times 24/28} \approx 5649$,
indicating that of those parasites that take up
DNA, just 1 in 5000 may be integrating it. This is a very different re-
sult to P. berghei where the rate of formation of episomes is reported
to be similar to that for integration. [97]
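The arithmetic behind this estimate can be reproduced in a few lines. The following is a minimal sketch, assuming (as the calculation above does) the lower-bound multiplication rate of 2.5-fold per 28-hour cycle and an 11-day delay between the two conditions; the function name is purely illustrative.

```python
def fold_expansion(rate_per_cycle, delay_days, cycle_hours=28.0):
    """Fold expansion of a parasite population over delay_days.

    Assumes constant exponential growth at rate_per_cycle per cycle.
    """
    cycles = delay_days * 24.0 / cycle_hours  # number of cycles in the delay
    return rate_per_cycle ** cycles


# Lower-bound growth rate from the text: 2.5-fold per 28-hour cycle,
# with integrants reaching 2% parasitaemia 11 days after episomal transfectants.
ratio = fold_expansion(2.5, 11)
print(f"Estimated ratio of episomal to integrated transfectants: {ratio:.0f}")
```

Because the delay is converted into an exponent, the estimate is very sensitive to the assumed growth rate: substituting the upper bound of 3.5-fold per cycle gives a far larger ratio, so the "1 in 5000" figure is best read as an order-of-magnitude estimate.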
Figure 6: Parasites come up rapidly under selection for P. knowlesi episomal transfections
but integrations take significantly longer.
[Plot of transfectant parasitaemia (%) against days post-transfection, for circular and linear DNA.]
Figure re-drawn to combine data from two figures of Moon et al. [142]
Such differences presumably reflect differences in the DNA repair
machinery between the two organisms. But using recently developed
techniques for enhancing the efficiency of homologous recombination
(section 1.10.3.1) it might be possible to bring the rate of integrant-
formation closer to the rate of episome formation in P. knowlesi.
1.4.3.2 The rationale for genetic modification of P. knowlesi
This project involves an attempt to bring novel experimental genetic
technologies to bear on P. knowlesi. Some might question why we
should study this parasite, when the majority of those suffering
from the disease may be monkeys. There are at least three possible
answers to this question.
The first is, ‘for its own sake’: it is clear that P. knowlesi causes signif-
icant numbers of fatalities in Southeast Asia each year, and it is pos-
sible that ecological pressures on its simian hosts could select for its
further adaptation towards humans. In addition, it appears that the
success of malaria control in reducing P. vivax and P.
falciparum populations may be resulting in increased P. knowlesi infections.
[205] These factors, and simple scientific curiosity, may lead us
to want to better understand the biology of this fascinating zoonotic
parasite.
But a second reason to study P. knowlesi is as a model for all malaria
species. Studies of rodent malaria have been crucially important in
understanding malaria as a whole, both because these parasites are
more amenable to genetic manipulation and because they allow host-
parasite interactions to be studied in vivo. Both of these advantages
apply to P. knowlesi with the additional advantage that it can be cul-
tured in vitro in human cells, allowing the scale of experiments to
increase without a corresponding increase in animal use.
A final interesting possibility of P. knowlesi research is that it might
allow us to better understand the elusive and important P. vivax. This
species, which causes considerable morbidity, [158] cannot be cul-
tured in vitro effectively which severely limits our understanding of
its biology. As shown in Fig. 4, P. knowlesi is far more closely related to
P. vivax than is any other model (indeed, in terms of evolutionary dis-
tance, even the rodent malaria parasites would be better models for
understanding P. vivax than would be P. falciparum). And P. knowlesi
recapitulates P. vivax biology in some areas. Naturally it shows a retic-
ulocyte preference [120], and its genome has a similar (A+T)-content
to P. vivax, significantly lower than that of P. falciparum. It does not
seem that P. knowlesi produces hypnozoites, the sleeper cells of P. vivax
which can lie dormant in the liver to produce a recrudescence years
later, but it is quite possible that there are many other features of P. vi-
vax biology shared with the other non-Laverania ape parasites which
we do not yet understand because we lack the ability to culture P.
vivax. P. knowlesi might shed light on some of these areas.
1.5 new sequencing technologies enable new approaches
Many of the techniques used in this project would not have been pos-
sible a decade ago; the cost of sequencing genomes has been radically
altered by the development of what is often called "next-generation
sequencing". Where sequencing by the Sanger method identified the
nucleotides in a single stretch of DNA of about 1 kb, these technologies
permit up to 4 billion short reads of DNA to be sequenced on a
single flow cell (Illumina HiSeq 2500). [94]
While read-lengths with this technology were initially less than 50
bp, improvements to the sequencing chemistry now allow two 150 bp
reads per DNA fragment sequenced. In recent years smaller scale sequencing
platforms such as the Illumina MiSeq have been launched.
While these are more expensive per basepair sequenced, they cost
very much less per run. This has made it feasible to routinely se-
quence small numbers of parasite genomes, and also to use massively
parallel sequencing for other purposes such as barcode-sequencing,
as we will discuss.
1.5.1 Reference genomes
In 2002 the P. falciparum genome was sequenced; this resource has
proved invaluable to malaria research. The P. knowlesi genome, which
represents the H strain, passaged exclusively through macaques, was
published in 2008, and there is also limited whole-genome data from
the Nuri strain. [150]
1.6 the biology of invasion
This project is focused on the very smallest developmental form of
the parasite, indeed one of the smallest eukaryotic lifeforms known,
the merozoite. P. falciparum spends less than two minutes per 48-hour
cycle (0.06% of its time) in this form, [75] and yet its biology is a
focus of intense interest.
The merozoite is a specialised machine designed to overcome one
of the greatest challenges the parasite faces. Once supplies of nutri-
ents in an erythrocyte are exhausted, parasites have to leave the rela-
tive safety of the parasitophorous vacuole and enter the bloodstream
to find a new home. There they face the full force of the immune
system: a sea of antibodies, white blood cells and complement. The
merozoite must avoid this adversity and rapidly bind to and enter a
new cell.
Invasion is considered an attractive target for drug and vaccine de-
velopment. The blood stage is the only target for a malaria therapy
that can cure a patient already showing symptoms, and the merozoite
is the only stage of this cycle in which the parasite is outside a host
cell. Thus it could represent the weak link at which the intraerythrocytic
cycle is most targetable.
Figure 7: The process of merozoite invasion
The key steps of invasion are shown, along with the organellar
secretion that occurs at each stage. Drawn with inspiration and
information from [42].
[Diagram. Stages shown: Egress, Attachment, Pre-invasion, Invasion, Post-invasion.
Other labels: reorientation, deformation, tight junction, dynamic, ‘resting’,
exonemes, micronemes, rhoptries, dense granules.]
1.6.1 The key steps of invasion are conserved across Plasmodium
The first detailed view of Plasmodium invasion in action came, in fact,
from P. knowlesi. The relatively large size of its merozoites enabled
the first video microscopy of invasion in 1975. [57] The researchers
observed that the haemozoin in a schizont would converge to a sin-
gle point; and that soon after the merozoites would separate, and
the erythrocyte swell, then explosively rupture. The merozoites re-
leased made contact with red blood cells, which were then engulfed
by waves of deformation. A short time after these ended the parasite
moved into the erythrocyte, through a pinch-point (the ‘tight junction’),
to give rise to a ring parasite. It was also observed that the
apical end of the merozoite had to be facing into the erythrocyte for
invasion to occur.
Subsequently Gilson and Crabb conducted similar experiments in
P. falciparum. [75] After making their observations of invasion and
comparing their results to those seen 34 years prior in P. knowlesi,
they wrote that:
Our observations of P. falciparum indicate that the mor-
phological steps and kinetics of erythrocyte invasion are
remarkably similar to those detected in P. knowlesi. This
conservation across a large evolutionary distance suggests
that the maintenance of invasion rates, including the tim-
ing of the individual steps, has reached an optimum.
Figure 8: Schematic of a merozoite
Major organelles are labelled (nucleus, exonemes, dense granules, rhoptries,
micronemes, inner membrane complex, surface coat); many of these (rhoptries,
micronemes, IMC, surface coat) are intimately involved in invasion.
These approaches, and many others, have led to the model of invasion
we have today (Fig. 7).
1.6.2 The anatomy of egress and invasion
The ultrastructure of the merozoite is divided into organelles which
function at different steps during the invasion process. The proteins
contained in these organelles vary in number, and in sequence, be-
tween the different Plasmodium species, and some mutate rapidly un-
der the high pressure of the immune system. Nevertheless the con-
served appearance of the process as a whole suggests that increased
understanding of the molecular biology of invasion in one species
will have ramifications for others.
As with most areas of parasite biology, the molecular details of
invasion are currently best understood in P. falciparum and so in this
section I will describe the process as it occurs in this species, and with
its gene nomenclature. Nevertheless many of the genes involved have
homologues across Plasmodium.
Before any invasion can occur, the merozoite must escape the con-
fines of its host erythrocyte. Triggering this egress process is the role
of the exonemes. [209] As a schizont matures, the calcium concentration
increases, causing the activation of cGMP-dependent protein kinase
(PKG). Ultimately PKG activation results in the secretion of the
contents of the exonemes into the parasitophorous vacuole. [38] One
crucial component of the exonemes is SUB1.
SUB1 is a serine protease that among other targets processes the
merozoite surface proteins MSP1, MSP6 and MSP7. At least some of
these events appear to be essential to prime the merozoite for inva-
sion. [36] SUB1 processing also activates SERA6, a protease essential
for egress. [170] The parasitophorous vacuole is now broken down,
likely by a process involving conscripted host calpain. [33] Finally
the erythrocyte membrane bursts and merozoites are released. [206]
Merozoites are explosively ejected from bursting schizonts. In the
blood they encounter low potassium levels, which cause calcium re-
lease and the discharge of the adhesins stored in the micronemes onto
the merozoite surface. [177]
When a merozoite comes into contact with an erythrocyte, its initial
contact is low-affinity and reversible. This attachment is most likely
mediated by members of the merozoite surface protein family. [42]
During this period the red blood cell ‘ruffles’ with visible deformations,
and the parasite reorientates until its apical end reaches the
erythrocyte surface.
An irreversible attachment is now formed, and this triggers the
rhoptries to begin to discharge, releasing the Rh proteins. These bind
to various receptors on the erythrocyte surface (Rh5 to basigin, Rh4
to complement receptor 1, and Rh1-3 to as yet unknown receptors).
Together with members of the EBL family the attachment of these
proteins appears to commit the parasite to invasion.
The next player to leave the rhoptries is the RON complex, which
is inserted into the erythrocyte. RON2 forms a complex with AMA1,
which has in the past been considered the interaction
that provides traction between the parasite and the erythrocyte,
although this is not definitively established. [16]
The last part of the rhoptry to discharge its contents is the ‘bulb’,
containing the lipids and proteins required to make up the para-
sitophorous vacuole, so that as the parasite starts to invade the ery-
throcyte it is able to build its own compartment inside.
The physical force for invasion comes from another parasite or-
ganelle, the inner membrane complex (IMC). The IMC lies beneath the
parasite plasma membrane and is made up of a series of flattened
vesicles, or alveoli, which are intricately connected to the cytoskele-
ton below. [110] It is this feature that gives the Alveolata, the super-
phylum to which Plasmodium belongs, its name. Between the IMC
and the plasma membrane is an actin-myosin motor which propels
the parasite into the erythrocyte and the nascent parasitophorous vacuole.
The IMC forces the parasite through a visible pinch point on the
erythrocyte surface, known as the tight junction.
It is clear that the formation of strong attachments between para-
site and erythrocyte is essential for invasion, but in order for invasion
to occur it is perhaps just as important that these attachments are
cleaved—without this one might imagine the parasite would remain
stuck to the outside of the erythrocyte. This ‘shedding’ of invasion lig-
and peptides into the supernatant is mediated by proteases including
ROM1, ROM4, and SUB2.
As the IMC pulls the parasite into the red blood cell, the tight junc-
tion is the last component to enter and finally the parasitophorous
vacuole is sealed. The parasite is now becoming a ring and the dense
granules are released to make the erythrocyte resistant to further invasion
events and to begin remodelling the erythrocyte to the parasite’s
own ends. The most dangerous period of the parasite’s intraerythro-
cytic cycle is over and the next 48 hours of ordered metabolism and
development can begin.
1.7 approaches to studying invasion
Broadly speaking, proteins involved in invasion can be thought of as
being divided into two categories: adhesins and invasins. [42] The
former bind directly to factors on the erythrocyte whilst the latter are
involved in invasion, but in less direct ways.
1.7.1 Detecting the molecular components underlying known host-parasite
interactions
Historically, the identification of machinery involved in invasion has
often relied upon the fact that these proteins target parasites to spe-
cific subsets of cells. Whenever a difference has been observed be-
tween the ability of parasites to invade one cell type compared to
another researchers have sought to identify the protein-protein inter-
actions underlying this difference, and thus build up a picture of the
invasion process.
For example, the observation that people of African descent are
largely protected from P. vivax malaria led to the discovery of DARC
as an essential host receptor for these parasites. [141] Analysis of parasite
proteins which bound differentially to Duffy-positive vs. Duffy-negative
blood then identified the parasite protein involved in this
process, Duffy binding protein (DBP). [204] Meanwhile, it had been
observed that P. falciparum, which does not require DARC for inva-
sion, was unable to invade cells which had their surface sialic acid
removed by enzyme treatment. [26] Similar binding experiments to
those undertaken in P. vivax allowed the identification of the pro-
tein involved, EBA-175. Comparison of the sequences of the DBPs
of P. knowlesi and P. vivax and EBA-175 from P. falciparum identified
shared domain architecture that grouped these proteins as the found-
ing members of a family, the erythrocyte-binding like (EBL) family.
[1]
The other major family of invasion proteins, the RBLs, were first
identified in P. vivax in a very similar way. [68] Parasite proteins which
bound only to reticulocytes were identified on protein gels. Again
here, proteins with similar sequences were identified in other parasite
species, forming the reticulocyte-binding like (RBL) family [162, 140].
The expansion of the families has sometimes involved bioinformat-
ics preceding biology. For example Rh5 was identified purely on the
basis of sequence similarity to other RBLs, and then its role in invasion
was established, [18] and finally its host receptor identified [45]
(i.e. the mirror image of the order of events in the discovery of DBP).
It is likely that there are other parts of the invasion machinery which
have yet to be found because they do not have sequence similarity to
known invasion genes.
1.7.2 The identification of components of the actin-myosin motor has pri-
marily relied on inferences of orthology
Many of the adhesins described above differ markedly between Plas-
modium species, and they lack homologues outside Plasmodium. This
is unsurprising given that these components—which interact with the
host, and are exposed to the immune system—are likely to be under
intense selective pressure and hence to evolve rapidly in evolutionary
arms-races between parasite and host.
Figure 9: A possible model for the mechanism of force production by the motor of
the malaria parasite
A molecular motor comprising MTIP and MyoA pushes the IMC
into the merozoite, pulling against filamentous actin which is anchored
to receptors in the erythrocyte. This anchoring may take
place through some glideosome adhesin connector (GAC) but this
is not yet definitively established. Partially adapted from [63]
[Diagram labels: erythrocyte, parasite, receptor, adhesin, GAC, actin, MyoA, MTIP,
GAP45, GAP50, GAP40, inner IMC, outer IMC, plasma membranes; two components
are marked with question marks.]
It took much longer for the invasin genes of Plasmodium to be characterised:
the motor complex was first reported in 2006, based on
components already characterised in Toxoplasma. [99, 17] The motile
stages of Plasmodium share an actin-myosin motor complex with other
Apicomplexan parasites, and the machinery used for gliding motility
in other stages plays a key role for merozoite invasion. Almost all
invasin biology was discovered in Toxoplasma before its orthologous
constituents were identified in P. falciparum.
Our understanding of this machinery, and how it links to the ery-
throcyte is still far from complete. The protein linking the motor to
a surface adhesin was originally proposed to be aldolase, but this
has recently been questioned and a candidate to replace it identified
in Toxoplasma. [173, 95] A very speculative model of how the motor
might function is presented in Fig. 9, based on this new GAC protein,
but this model, pieced together from data from various species, certainly
excludes some key components, and may well be fundamentally
flawed. Whether the glideosome adhesin connector (GAC) in
Plasmodium is homologous to that in Toxoplasma is not known. Furthermore,
the adhesin that links to the GAC is not established. TRAP plays
this role in sporozoites, and a merozoite homologue, MTRAP, is a candidate
to do so during erythrocyte invasion, but supporting data is
currently sparse.
A recent analysis identified a second myosin motor MyoB and an
interacting light chain (PF3D7_1118700) as very likely also implicated
in invasion. [211]
1.7.3 Novel approaches may be needed to expand our understanding of
invasion-related components
Thus far, many techniques that have identified proteins involved in in-
vasion, from the binding experiments of the 1980s to AVEXIS (Avidity-
based extracellular interaction screening) experiments today, have been
biased—for very good reasons—towards detecting adhesins. Despite
an array of challenges, these are typically easier to identify: there is a
target and relatively open-ended experiments are carried out to iden-
tify what binds to it.
Experiments to discover invasins have been generally limited to in-
vestigating sequences homologous to those better understood in other
systems, but this prevents the discovery of any machinery unique to
Plasmodium.
Often experiments to directly identify invasins involve testing very
specific hypotheses—that X processes Y so that it can interact with
Z. Increasing the scale of invasin discovery will require much more
hypothesis-free approaches.
1.7.3.1 Transcriptomic data can predict protein function
One such approach was developed by Hu and colleagues in 2010
using transcriptomic data. [92] Previously researchers had tried to ex-
ploit the "just in time" nature of Plasmodium transcription to assign
functions to proteins. [22, 163] These approaches essentially treated
co-expression over the course of the intraerythrocytic development
cycle (IDC) as evidence of a possible common function. However,
because a number of different processes of parasite biology can be
active at any point in the lifecycle, these approaches did not yield
high-confidence predictions. [92]
Hu et al. aimed to add more dimensions of resolution to the tran-
scriptome by exposing parasites to 20 different compounds known to
inhibit parasite growth. They found that each compound left a unique
signal on the transcriptome of the parasites perturbed by it. The idea
here was not to investigate the compounds themselves, but to exploit
the possibility that two genes whose expressions were influenced in
a similar fashion by exposure to a particular compound might be in-
volved in the same part of parasite biology.
The researchers took 5-10 time points for each compound, giving a
total of 144 microarrays. They combined this data with IDC microar-
rays for field and lab strains and then looked for co-expression as
evidence of a functional relationship.
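Co-expression of this kind is commonly quantified as a correlation between the expression profiles of two genes across all arrays. The sketch below illustrates the idea only: it assumes a Pearson correlation (the measure actually used by Hu et al. is not stated here), and the gene names and expression values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical log-ratio expression values, one entry per microarray.
profiles = {
    "gene_A": [0.1, 0.9, 1.8, 0.7, -0.2],
    "gene_B": [0.0, 1.1, 2.1, 0.6, -0.4],   # rises and falls with gene_A
    "gene_C": [1.5, 0.2, -0.9, 0.4, 1.7],   # moves in the opposite direction
}

r_ab = pearson(profiles["gene_A"], profiles["gene_B"])
r_ac = pearson(profiles["gene_A"], profiles["gene_C"])
print(f"A vs B: {r_ab:.2f}, A vs C: {r_ac:.2f}")
```

In this toy example gene_A and gene_B would be scored as strongly co-expressed, while gene_A and gene_C are anti-correlated. Adding perturbation arrays lengthens each profile, which makes chance correlations between functionally unrelated genes less likely.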
This co-expression evidence was then combined with data from
yeast two-hybrid experiments (which use a heterologous expression
system to detect protein-protein binding), and data on the phyloge-
netic distance between proteins and domain-domain interactions. A
Bayesian approach was used to integrate these four data sources into
a single score for any two proteins’ likelihood of being involved in an
interaction (Fig. 10).
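One simple way of integrating several evidence sources in a Bayesian framework is to treat them as independent and multiply a likelihood ratio from each onto the prior odds of interaction (a naive Bayes combination). The sketch below illustrates that general idea; it is not claimed to be the model used by Hu et al., and the prior and all likelihood ratios are invented values.

```python
from typing import Iterable


def posterior_probability(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Update a prior probability with independent likelihood ratios.

    Each ratio is P(evidence | interaction) / P(evidence | no interaction).
    """
    odds = prior / (1.0 - prior)          # convert probability to odds
    for ratio in likelihood_ratios:
        odds *= ratio                     # independent evidence multiplies odds
    return odds / (1.0 + odds)            # convert back to a probability


# Hypothetical likelihood ratios for one protein pair, one per data source.
evidence = {
    "co_expression": 20.0,
    "yeast_two_hybrid": 15.0,
    "phylogenetic_relationship": 2.0,
    "domain_domain_interaction": 3.0,
}

# Hypothetical prior: 1 in 1000 protein pairs interact.
interaction_probability = posterior_probability(0.001, evidence.values())
print(f"Posterior probability of interaction: {interaction_probability:.2f}")
```

The appeal of this kind of scheme is that weak evidence from several independent sources can combine into a confident prediction, even when no single source would be convincing alone.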
The ‘PlasmoINT’ database of putative interactions that this analy-
sis produced has implications for many areas of parasite biology, but
the researchers chose to validate it by looking at invasion. They chose
25 proteins known to be involved in invasion and then identified the
subnetwork formed by selecting all proteins linked with > 90% confi-
dence to at least one of these.
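The selection step described above amounts to taking the one-step neighbourhood of the seed proteins in a confidence-weighted network. A minimal sketch follows, assuming the network is available as a list of (protein, protein, confidence) tuples and that the seeds themselves are counted as members; the protein names and confidence values are hypothetical.

```python
def seed_subnetwork(edges, seeds, threshold=0.9):
    """Return the seeds plus every protein directly linked to a seed
    with confidence above the threshold."""
    seeds = set(seeds)
    members = set(seeds)
    for protein_a, protein_b, confidence in edges:
        if confidence > threshold:
            if protein_a in seeds:
                members.add(protein_b)
            if protein_b in seeds:
                members.add(protein_a)
    return members


# Hypothetical scored interactions.
edges = [
    ("seed_1", "protein_X", 0.95),     # included: direct high-confidence link
    ("protein_X", "protein_Y", 0.99),  # excluded: Y is not linked to a seed
    ("seed_2", "protein_Z", 0.50),     # excluded: below the threshold
    ("protein_W", "seed_2", 0.98),     # included: edge direction is ignored
]

invadome = seed_subnetwork(edges, {"seed_1", "seed_2"})
print(sorted(invadome))
```

Restricting membership to direct neighbours of the seeds keeps the subnetwork anchored to proteins with an established role; following links transitively would instead grow the set with each step away from that experimental evidence.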
This analysis identified a 418-gene subnetwork, a putative ‘invadome’.
The researchers successfully characterised the localisation of 42 of the
additional proteins identified. 31 of these were consistent with a role
in invasion (the remainder had mostly cytoplasmic localisations, not
indicative of invasion but not necessarily excluding some role in it),
giving some confidence to the data. But, as we would expect from
our limited understanding of the genome (Fig. 2), 37% of these genes
have as their only annotation "conserved Plasmodium protein, unknown
function". This implies that there are still many components
of the invasion machinery we have yet to understand.
[Figure 10 schematic: pairwise evidence matrices for genes A to E from
four sources (perturbation microarrays, phylogenetic relationship,
yeast two-hybrid, domain-domain interaction) are combined by a
Bayesian model.]
Figure 10: Schematic of approach taken by Hu et al. to construct the PlasmoINT
interaction network
A Bayesian approach was used to optimally combine four data
sources to infer potential protein-protein interactions.
1.8 experimental genetics in plasmodium
1.8.1 The DNA repair machinery available in the parasite informs the ex-
perimental genetic approaches that are possible.
That reverse genetics is possible in any eukaryotic system is the unin-
tended consequence of cellular machinery designed to repair genomes
from natural damage. Each day cellular processes and external fac-
tors such as cosmic rays create an estimated 10 double-strand breaks
(DSBs) in each human cell, and these must be efficiently repaired.
[119]
Where this repair machinery differs between organisms this af-
fects the possible strategies for reverse genetics. Crucially, Plasmodium
lacks the machinery for canonical non-homologous end joining (C-
NHEJ), and although it does possess an alternative, microhomology-
mediated, end joining mechanism [107] these events are rare. [167]
Thus Plasmodium is left mostly reliant on repairing its genome by
homologous recombination (HR), a process for which it does possess
the requisite proteins. [115] The presence of this system, and the ab-
sence of C-NHEJ, is in one sense of great assistance to malaria geneti-
cists because it means that almost the only way in which a parasite
can integrate a resistance marker is by homologous recombination,
which necessarily targets the marker to a particular place in the
parasite genome.
[Figure 11 schematic; labels: Homologous region, Chromosome, Insert.]
Figure 11: Schematic illustrating the process of ends-in integration
With this approach the entire plasmid is copied into the genome
with the single homology region replicated.
DNA repair is a complex process in which a number of different
mechanisms can lead to the same end result in the genome of a trans-
fected parasite. There are, however, just two possible arrangements in
which exogenous DNA can be incorporated into a parasite. These are
sometimes called single and double crossover, but these names are
not accurate because an apparent ‘double crossover’ event can actually
occur without any crossing over. [115] The names I will use here
are ‘ends-in’ and ‘ends-out’, respectively.
1.8.2 Integration events fall into two main classes
In an ends-in integration event there is a single region of homology be-
tween the target genome and the targeting plasmid. Crossover occurs
somewhere in this region, and the entire plasmid is then copied into
the genome, sometimes multiple times. There are two disadvantages
to this approach as compared to ends-out integration.
Firstly, a region can never actually be deleted from the genome by
this method; DNA can only be added. This doesn’t make knock-outs
impossible (a critical part of a gene can be disrupted by an insertion),
but it does introduce complexity.
Secondly, because the entire plasmid is introduced it is never pos-
sible to use negative selection to distinguish between integrants and
parasites carrying episomes.
In ends-out recombination there are two regions of homology, either
side of the region to be replaced. The result is incorporation of the
DNA between them, in place of that region in the genome.
The limited data available suggests that the primary way in which
ends-out recombination occurs in Plasmodium is by synthesis-dependent
strand annealing (SDSA). [184]
In this mechanism (Fig. 12), a double strand break forms (spon-
taneously, or otherwise) in the genome near the site to be targeted.
There is then resection of both fragments by exonucleases which chew
back on the 5’ (and to a lesser extent the 3’) ends of the break. One
or both of the free 3’ strands can then invade a homologous region
of the transfected plasmid. Here it acts as a primer, to be extended
in the 3’ direction by polymerases. At first a region of homology will
be copied, then the heterologous resistance marker and any tags, and
then a final region of homology. This final region provides material
which can anneal to the other side of the lesion to repair the double
strand break. Any nucleotides absent from the other strand can now
be synthesised on the basis of this completed template.
While this system exists to allow repair from fully homologous se-
quences, for example in other chromosome copies during schizogony,
it is enlisted by the geneticist to incorporate alterations and markers
into a parasite genome.
1.8.3 Genetic screens are crucial for understanding genomes at scale
Large scale genetic screens for particular phenotypes, from develop-
mental defects in Drosophila to metabolism in S. cerevisiae, have made
a more profound contribution to our understanding of model organ-
isms than any other technique. But such techniques are just begin-
ning in Plasmodium, where publications which investigate more than
50 genes are very rare.
There are two broad approaches that can be used in a genetic
screen: forward and reverse genetics.
[Figure 12 schematic; panel labels: Gene to be knocked out. Spontaneous
or induced DSB. Resection by 5’ exonucleases and concurrently by 3’
exonucleases. Linear donor DNA, with selection marker. Strand invasion
by free DNA ends, or by just one of the free DNA ends. Free DNA ends
are extended, copying the donor sequence. Once homology on the other
side of the DSB has been copied, strand annealing and ligation can
occur. Strand is extended by synthesis. Strand annealing. The
incomplete strand is synthesised by copying the newly-synthesised
strand.]
Figure 12: Synthesis-dependent strand annealing
This figure illustrates one of the processes by which ends-out in-
tegration of DNA can occur. A double-strand break forms within
the gene to be deleted, and resection by exonucleases exposes re-
gions homologous to the donor vector. Strand invasion enables
the synthesis of new genomic DNA from the donor template and
ligation repairs the nicks in the genome.
A forward genetics approach is any in which the discovery of a phe-
notype occurs before the identification of the genetics underlying it.
Perhaps the best example is the discoveries of Nüsslein-Volhard and
Wieschaus in Drosophila embryology. [146] They chemically induced
random mutations in flies and looked for developmental phenotypes.
The location of the genetic change driving the phenotype was then
mapped by crossing mutant flies with wild-type flies and looking for
markers which co-segregated with the phenotype.
One problem with employing such an approach in Plasmodium is
the intense difficulty of crossing parasites due to the complexities of
the sexual cycle. This would likely mean a need for whole-genome
sequencing to identify the mutations present in cloned parasites with
a phenotype. Such an approach might be possible in this age of ever-
cheaper sequencing, but the mutation rate of the parasite (Hamilton,
Claessens, et al., unpublished) means that multiple mutations might
be identified leaving the causative one unclear.
Transposon mutagenesis has become a favoured way to randomly
create loss-of-function mutations which can be easily mapped by meth-
ods such as inverse PCR. The technique has been employed effectively
in large numbers of organisms including Plasmodium, where it has
yielded many clones with mapped insertions. [14] However, until the
development of recent unpublished techniques (similar to barcode-
sequencing) the phenotyping of these parasites has been relatively
laborious.
By contrast, a reverse genetics approach disrupts a known region of
the genome and then looks for any resultant phenotype. This has the
advantage of allowing a targeted investigation of a subset of genes,
for example every kinase in the Plasmodium genome. [189, 80]
The discovery of RNAi has simplified reverse genetics approaches
in many organisms by allowing easy knock-downs of gene expres-
sion, but Plasmodium lacks the machinery for this process and so it is
not a technique available to malariologists. [19]
Instead, at least until alternative techniques such as dCas9 are per-
fected [133], targeted mutations must be made in the parasite genome.
The targeted nature of these genetic alterations has some advantages
over forward genetics approaches: genes can be deleted entirely, rather
than relying on the creation of a frameshift or the insertion of a
premature stop codon.
1.8.4 History of experimental genetics in malaria
When the first P. falciparum transfection was carried out in 1995, the
authors wrote that ‘one problem has been the cloning of (A+T)-rich
sequences flanking the coding regions of P. falciparum genes, as many
recombinant DNA constructs containing these sequences are unsta-
ble in Escherichia coli’. [208] They had found that the P. falciparum
dhfr gene, which they hoped to use (in mutant form) as a selectable
marker, spontaneously re-arranged during cloning.
In the two decades since these first transfections it is estimated that
only ~500 Plasmodium genes have been successfully targeted for gene
disruption, [48] the majority in P. berghei. One challenge for genetic
modification has been low transfection efficiencies. This is perhaps
unsurprising considering the number of membranes that DNA has
to cross to reach the nucleus. To successfully transfect ring or tropho-
zoite stage parasites, DNA must cross four membranes: the red blood
cell plasma membrane, the parasitophorous vacuole membrane, the
parasite plasma membrane and finally the nuclear envelope.
A number of techniques have been developed in an attempt to increase
transfection efficiency. These include ‘pre-loading’ erythrocytes
with DNA and then introducing parasites to them, electroporation of
ring stage and schizont stage parasites and ‘double-tap’ combinations
of these. [88, 27] Anecdotally, the schizont appears to be the parasite
stage most amenable to transfection. [142] Additionally, it seems that the
square wave produced by Lonza electroporators enhances DNA up-
take. [122]
1.9 the plasmogem project
The Plasmodium genetic modification project (PlasmoGEM) at the Well-
come Trust Sanger Institute aims to increase the efficiency and scale
of Plasmodium genetics using new genetic technologies and plate-
based vector generation techniques.
1.9.1 Linear cloning vectors allow the creation of more efficient targeting
vectors
As previously mentioned, one of the problems of working with Plas-
modium is that the parasites have extremely (A+T)-rich DNA which
is difficult to propagate in circular vectors using the conventional lab-
oratory workhorse Escherichia coli.
The first important component of PlasmoGEM was the creation of
a large-insert library of P. berghei genomic DNA in the pJAZZ vector
developed by Lucigen. This vector is derived from the N15 coliphage,
which has a linear genome of double-stranded DNA flanked by co-
valently closed hairpins created by the phage-encoded protelomerase
TelN. [78] The pJAZZ vector is a version of this phage which lacks the
genes for the phage’s lytic cycle, replacing them with a drug resis-
tance marker and a multiple cloning site. It is maintained in the TSA
strain of E. coli which has the TelN gene integrated into the bacterial
genome.
The linear topology of the vector means that unlike a closed circular
plasmid it is not possible for supercoils to be formed. Superhelicity
is largely responsible for the instability of (A+T)-rich DNA and so
pJAZZ vectors can allow inserts of up to 28 kb of Plasmodium DNA
to be maintained. [154]
The linear topology of the vector has a secondary advantage. Be-
cause vectors are never circular, even inefficient restriction enzyme
digestion to release inserts will not leave circular vectors with the
ability to form episomes that can confer drug resistance to parasites.
A library of genomic fragments inside pJAZZ backbones has now
been produced covering 91% of P. berghei genes, and smaller pilot li-
braries have been produced for P. knowlesi and P. falciparum. Capillary
sequencing has been used to map the position of each library clone
in the relevant parasite genome.
1.9.2 A robust pipeline for vector production increases the scale of genetic
studies
The second important component of the PlasmoGEM project is a high-
throughput procedure for taking a clone from this library, containing
wild-type parasite genomic sequence, and turning it into a targeting
vector containing a drug resistance marker. For these purposes, a
pipeline involving recombineering and Gateway technology has been
developed. [154]
[Figure 13 schematic: a library clone (GOI 5’ UTR, gene of interest,
GOI 3’ UTR) is converted into a Zeo intermediate by recombineering
with a Zeo/PheS cassette amplified by PCR with primer overhangs
(positive selection with Zeocin), and then into a transfection vector
by a Gateway reaction (negative selection with
d,l-p-chlorophenylalanine) that introduces HA tag, 3’ UTR, 5’ UTR,
dhfr/yfcu and 3’ UTR after the gene of interest.]
Figure 13: Procedure for the generation of tagging PlasmoGEM constructs
[Figure 14 schematic: the same steps as in Figure 13 (PCR with primer
overhangs, recombineering, Gateway reaction), except that the Zeo/PheS
cassette, and subsequently the Gateway cassette, lies between the GOI
5’ UTR and GOI 3’ UTR in place of the gene of interest.]
Figure 14: Procedure for the generation of knock-out PlasmoGEM constructs
In this methodology the library clone containing the target locus
is transformed with a plasmid encoding the bacteriophage λ red re-
combinase operon and E. coli recA recombinase. After 16 hours of
incubation to allow this protein to be expressed, the bacteria are tran-
siently transformed with a cassette expressing the zeo-pheS positive-
negative selection marker. The cassette is flanked by short regions of
homology to the genomic clone, added by PCR with long primers.
These permit the recombinase to exchange the cassette into the ge-
nomic clone, replacing the gene of interest in the case of a knockout
vector, or immediately following it in the case of a tagging vector.
The zeo-pheS cassette allows positive-negative selection and is flanked
by attLR sites. This means that the vector containing it is a powerful
intermediate; a Gateway recombinase [213] reaction can be used to
insert any of a range of different cassettes into the vector, each encod-
ing a drug resistance marker and, in the case of tagging vectors, an
epitope tag.
1.9.3 PlasmoGEM vectors carry barcodes which enable novel approaches
to genetic screens
An important component of the PlasmoGEM approach is the barcode
carried in each vector. During the PCR of the Zeo-PheS cassette a
unique 11 bp sequence is added to each vector, flanked by constant
annealing sites. This barcode identifies the gene targeted by the vec-
tor. The barcodes are designed with redundancy such that even the
mutation of two nucleotides will not cause one barcode to become
another.
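The redundancy property described above is equivalent to requiring that every pair of barcodes differs at three or more positions (a minimum Hamming distance of at least 3). The sketch below checks this for a set of barcodes; the sequences are invented 11 bp examples, not real PlasmoGEM barcodes, and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Invented 11 bp barcodes for illustration only.
barcodes = ["AAAAAAAAAAA", "AAAAAAAACCC", "CCCAAAAAAAA"]
distance = min_pairwise_distance(barcodes)

# Two substitutions cannot turn one barcode into another if distance >= 3.
tolerates_two_mutations = distance >= 3
print(distance, tolerates_two_mutations)
```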
A PCR reaction with the same primers can be used to amplify bar-
codes from any vector, or from the genome of a parasite transfected
with any vector. The advantage of this approach is that a pool of dif-
ferent vectors can be transfected together into a parasite population.
When selection is applied to kill parasites which have not integrated
a vector, a pool of transgenic parasites remains, with different genes
deleted in different parasites. The gene deleted in each case is
indicated by the barcode incorporated into that parasite. This means that
the ratio of different barcodes in the population can be used to follow
the relative growth of parasites with different genes knocked out.
[Figure 15 schematic: pooled transfection of schizonts, followed by an
infection timecourse to day 8 with PCR at each sampled time point and
barcode counting by Illumina sequencing. Example barcode proportions
for Gene A / Gene B / Gene C: 33% / 33% / 33% at day 0, 60% / 40% / 0%
at day 4 and 80% / 20% / 0% at day 5. Phenotype labels: normal,
essentiality, slow.]
Figure 15: Schematic view of barcode sequencing
A pool is made of barcoded knock-out constructs targeting a number
of different genes. In this case there are three constructs, one
targeting a redundant gene (green), one a gene whose knock-out
parasites are attenuated (yellow), and one an essential gene (red).
These constructs are transfected into schizonts and integrate into
the parasite genome, producing a mixed population of knock-outs
with the gene knocked out in each case indicated by the
presence of a barcode. Selection is applied to kill parasites not
carrying a vector, and from days 4 to 8 of the infection a small
sample of blood is taken each day. These samples and the input
pool are subjected to PCR to amplify their barcodes. These
are then quantified by Illumina sequencing. Essential genes can
be identified by the presence of their barcodes in the input pool
but absence in the surviving transfected parasite population (because
all parasites integrating them have died as a result). Attenuating
and redundant knock-outs can be distinguished by comparing
the fold-change on successive days of the infection.
Because all barcodes are flanked by the same primer annealing
sites, a single PCR reaction can amplify all of the different barcodes in
the pool, maintaining their relative abundances. Massively parallel se-
quencing (section 1.5) can then be used to sequence many thousands
of these barcode molecules in order to quantify the ratios within
them.
The incorporation of index tags during the preparation of barcodes
can allow at least 32 different barcode pools to be sequenced on a
single MiSeq lane. These pools can represent independent replicates of
an experiment, and also different points in time during an individual
experiment.
This means that a pool of vectors can be transfected into a para-
site, and then DNA extracted from the parasite population daily as it
grows up under selection. When barcodes are counted for each day,
the relative growth rate of parasites with different genotypes can be
inferred. Additionally those genes whose barcodes never appear in
the population can be identified as refractory to deletion.
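As a rough illustration of how daily barcode counts translate into relative growth rates, the sketch below converts counts to proportions and expresses each mutant's day-to-day change relative to a reference mutant assumed to grow normally. The normalisation to a reference, the function names and the counts are assumptions for illustration; they are not the analysis pipeline used in this work.

```python
def proportions(counts):
    """Convert raw barcode read counts into fractions of the pool."""
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def relative_fold_change(earlier, later, reference):
    """Change in each barcode's proportion between two days,
    divided by the change seen for the reference barcode."""
    p_early, p_late = proportions(earlier), proportions(later)
    ref_change = p_late[reference] / p_early[reference]
    return {
        gene: (p_late.get(gene, 0.0) / p_early[gene]) / ref_change
        for gene in p_early
        if p_early[gene] > 0
    }


# Hypothetical read counts on two successive days.
day4 = {"gene_A": 6000, "gene_B": 4000}
day5 = {"gene_A": 8000, "gene_B": 2000}
change = relative_fold_change(day4, day5, reference="gene_A")
print(change)
```

A value of 1 indicates growth matching the reference, and values below 1 indicate an attenuated mutant.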
This technique has been developed and validated in P. berghei, where
it has been shown that a single pooled experiment recapitulates the
phenotypes found in previous single-transfection experiments. [79]
1.10 approaches to enhance transfection further
1.10.1 Increasing the basal growth rate
What makes Plasmodium berghei such an efficient model system for
malaria experimental genetics? One basic factor which should not be
overlooked is that the parasite grows very fast. During the exponen-
tial phase of an infection parasitaemia increases tenfold each twenty-
four hour cycle. P. falciparum by contrast rarely exceeds an 8-fold in-
crease over a forty-eight hour cycle. Because growth is exponential, in 6
days a single P. berghei parasite will have produced a million progeny
while a P. falciparum parasite has in the same time produced at most
512. Our inability to achieve such growth rates in P. falciparum and P.
knowlesi is likely an absolute limitation of culture systems, although
we do know some ways to increase the re-invasion rate, such as using
an orbital shaker. [7]
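The arithmetic behind this comparison can be checked directly. The sketch below uses only the figures quoted above (tenfold per 24-hour cycle, at most 8-fold per 48-hour cycle, 6 days); the function name is hypothetical.

```python
def progeny(fold_per_cycle, cycle_hours, days):
    """Parasites descended from one parasite after a whole number of cycles."""
    cycles = (days * 24) // cycle_hours
    return fold_per_cycle ** cycles


berghei = progeny(fold_per_cycle=10, cycle_hours=24, days=6)     # 6 cycles
falciparum = progeny(fold_per_cycle=8, cycle_hours=48, days=6)   # 3 cycles
print(berghei, falciparum)
```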
Differences in basal growth rate clearly affect how quickly one can
obtain mutants, but they also define whether one can obtain a par-
ticular mutant. A particular gene, when mutated, might decrease a
parasite’s growth rate to 25% of its wild-type value. For P. berghei this
is a decrease from 10-fold to 2.5-fold growth. In the case of P. knowlesi
this would be a decrease from 3-fold growth to 0.75-fold growth. But
of course a growth rate less than 1 means that the transgenic para-
site will never be seen, however long culture is continued. The higher
the basal growth rate the more genes are experimentally accessible,
making optimising culture conditions especially important.
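The threshold argument can be made explicit with a small worked example, using the growth rates quoted above. A mutant is treated as recoverable only if its fold growth per cycle exceeds 1; the function names are hypothetical.

```python
def mutant_fold_growth(wild_type_fold, relative_fitness):
    """Fold growth per cycle of a mutant with the given fraction
    of the wild-type growth rate."""
    return wild_type_fold * relative_fitness


def recoverable(wild_type_fold, relative_fitness):
    """A mutant population only expands if it more than replaces itself."""
    return mutant_fold_growth(wild_type_fold, relative_fitness) > 1


berghei_mutant = mutant_fold_growth(10, 0.25)   # 2.5-fold: population grows
knowlesi_mutant = mutant_fold_growth(3, 0.25)   # 0.75-fold: population shrinks
print(berghei_mutant, recoverable(10, 0.25))
print(knowlesi_mutant, recoverable(3, 0.25))
```

Equivalently, the lowest relative fitness that can be recovered is the reciprocal of the wild-type fold growth: 10% for a tenfold system, but about 33% for a threefold one.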
1.10.2 The rate of DNA integration differs between Plasmodium species
This rapid growth is not the only factor that makes P. berghei so sat-
isfactory to work with. On day 4 after a transfection of P. berghei, a
low parasitaemia is visible under selection. The vast majority of these
parasites (we know from the reliability of barseq experiments) are
the result of correctly targeted double crossover events. Day 4 for P.
berghei approximately corresponds to day 8 for P. knowlesi, due to the
lower growth rate. In optimal P. knowlesi experiments, even on day 8,
60% of the population does not contain the expected genotype [142],
and this is despite the fact that the raw transfection efficiency of P.
knowlesi, the ease with which we can get DNA into the parasites, is
orders of magnitude higher than in P. berghei.
There appears to be a distinction between how easy it is to get DNA
into the parasite and how easy it is to get that DNA integrated into
the parasite genome (Table 1). Now that the former seems very well
optimised for P. knowlesi perhaps there are approaches from other
systems which can help to improve the latter.
Parasite species    Transfection efficiency    Linear DNA integration
P. falciparum       Low                        Very low
P. berghei          Medium                     High
P. knowlesi         Very high                  Medium
Table 1: Comparison of transfectability of the three parasite species analysed
in this work.
1.10.3 Enhancing integration efficiencies
A number of methods for generating double strand breaks (DSBs) on
demand have now been developed, initially to assist in the study of
the machinery of recombination, [169] but later to enhance integration
efficiencies.
Initially DSBs were created by inserting sites for the site-specific
eukaryotic nuclease I-SceI into the region to be studied. Since then,
a succession of nucleases has been developed, each one allowing
more convenient targeting than the last.
Zinc finger nucleases (ZFNs) were developed by fusing a non-specific
nuclease domain from the FokI enzyme to zinc-fingers, protein mo-
tifs from transcription factors which bind to specific triplets of DNA.
Once constructed, these allowed DSBs to be created at any site of in-
terest, allowing significant increases in targeting efficiency in many
systems [136] including Plasmodium [184, 144].
Once a validated ZFN is produced it is a very effective method of
inducing DSBs. The downsides of ZFNs are practical: they are expen-
sive, requiring synthesis by an external firm and a number of can-
didates typically need to be tested before one is found which works
efficiently. This would make scaling up a ZFN approach to large num-
bers of genes prohibitively expensive.
These problems were initially ameliorated with TALENs, which replaced
the zinc-fingers with DNA-binding domains from the TAL-effector
proteins of plant pathogens. Each repeat in these domains has a
defined peptide sequence that binds a single nucleotide, enabling
in-house synthesis from a distributed kit. But TALENs had a limited
time at the technological forefront before the next targeted-nuclease
revolution.
1.10.3.1 CRISPR-Cas9
In the short time since its development, CRISPR-Cas9 technology has
radically changed many fields of genetic research. The Cas9 protein
is a nuclease which is guided to its target by binding a short guide
RNA sequence (gRNA). These gRNAs can be transcribed from DNA,
meaning that targeting Cas9 to a new section of the genome is as
simple as ordering a pair of oligonucleotides.
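To illustrate why retargeting is so simple, the sketch below lists candidate guide sequences in a stretch of DNA by finding every NGG PAM site and taking the bases immediately upstream of it. It assumes a 20-nucleotide guide, which is the usual length for S. pyogenes Cas9 but is not stated in the text, and it scans only the forward strand. The function name and the example sequence are invented.

```python
import re


def find_guides(sequence, guide_length=20):
    """Return (guide, PAM, PAM start index) for each NGG site that has
    enough upstream sequence to supply a full-length guide."""
    sequence = sequence.upper()
    guides = []
    # A lookahead is used so that overlapping PAM sites are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", sequence):
        pam_start = match.start()
        if pam_start >= guide_length:
            guide = sequence[pam_start - guide_length:pam_start]
            guides.append((guide, match.group(1), pam_start))
    return guides


# Invented (A+T)-rich target with a single PAM site at its 3' end.
target = "ATATTTAATAAGCATATTTATGG"
candidates = find_guides(target)
print(candidates)
```

In a genome as (A+T)-rich as that of Plasmodium, GG dinucleotides are comparatively scarce, so a scan of this kind is a natural first step in choosing a cut site.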
A number of studies have now reported success in using CRISPR-
Cas9 to modify malaria parasite genomes. [212, 72, 201] The approach
has allowed linear DNA to be used to target P. falciparum for the first
time [72], reducing the problem of episomes, and has also allowed the
selection of marker-free parasites with genomic modifications. [201]
Because of the absence of NHEJ machinery in Plasmodium (see sec-
tion 1.8.1), not all of the CRISPR approaches used in other eukaryotic
systems are possible in Plasmodium. It is always necessary to include a
template for repair, precluding mass-gene disruption by transfection
with libraries of guide RNAs.
Nevertheless this technology is already transforming Plasmodium
research and is likely to profoundly change approaches to malaria
reverse genetics as it is more widely adopted. In combination with
the high transfection efficiency of P. knowlesi it might be a powerful
approach.
[Figure 16 schematic; labels: Genomic DNA, Guide RNA, Cas9 protein,
PAM site (NGG).]
Figure 16: Schematic of CRISPR/Cas9 function
A guide RNA containing a binding region and a structural region
binds to the Cas9 protein and directs it to a region of the genome.
Once bound the Cas9 protein’s endonuclease activity cleaves the
DNA to create a double-strand break.
1.11 aims and objectives
My goal in this project was to genetically investigate proteins in-
volved in Plasmodium invasion at the largest scale currently possible,
using PlasmoGEM technology, and in the process to develop new ap-
proaches and tools. To these ends I had three main aims.
1.11.1 AIM 1: To conduct a screen for growth phenotypes for putative
invasion-related genes using PlasmoGEM vectors
Since large scale genetic disruption of invasion-related genes has not
previously been conducted, I aimed to perform such a screen using
the rodent model P. berghei. In Chapter 3 I will describe this screen,
which used the in vivo barcode-sequencing technique that has been
developed in P. berghei to investigate the effect of deleting each of
145 genes for which there is evidence of possible involvement in
invasion. I also followed up a number of unexpected results with
individual transfections.
1.11.2 AIM 2: To bring PlasmoGEM technology to P. knowlesi
There are potential limitations to analysing invasion in vivo and so I
also aimed to adapt the PlasmoGEM system for the generation and
transfection of constructs targeting the P. knowlesi genome. In Chap-
ter 4 I will discuss my optimisation of these techniques and the ul-
timately successful deletion and tagging of invasion-related genes in
this parasite with PlasmoGEM vectors.
1.11.3 AIM 3: Bioinformatic analysis of phenotypes based on protein-protein
interactions
Finally, the production of large scale phenotype resources necessitates
the development of new approaches to extract maximal insights from
these datasets. In Chapter 5 I will describe my efforts to uncover pat-
terns within my phenotyping data using protein-protein interaction
and transcriptomic datasets and to predict and explain mutant phe-
notypes on the basis of these relationships.
2
MATERIALS AND METHODS
2.1 production of dna vectors
2.1.1 Cloning of P. knowlesi Gateway cassette
The PbGWR6K3HA vector, used for the production of P. berghei vectors,
was digested with KpnI and PstI to release the P. berghei selection
cassette. Three fragments were amplified by PCR:
P. knowlesi Hsp70 5’ UTR (primers: TPR321, TPR322; template: P.
knowlesi genomic DNA)
Dhfr/yfcu fusion (primers: TPR323, TPR324; template: PbGWR6K3HA)
P. knowlesi Hsp70 3’ UTR (primers: TPR325, TPR326; template: P.
knowlesi genomic DNA).
These primers added overlaps so that the four fragments could be
assembled by Gibson Assembly [74] (NEB Gibson Assembly kit). The
product was transformed into One Shot Pir2 chemically competent E.
coli (Invitrogen).
The final construct is in the R6K backbone, meaning it can only
propagate in Pir+ E. coli, ensuring no carryover in vector production.
The plasmid was prepared for subsequent work using a Qiagen Plas-
mid Maxi Kit.
2.1.2 pJAZZ vectors
The PlasmoGEM approach employs recombineering and Gateway tech-
nology to construct high-efficiency targeting vectors for Plasmodium
genes (see section 1.9). In the course of this project PlasmoGEM vec-
tors were used to target P. berghei and, for the first time, P. knowlesi.
One advantage of this method is that it is highly scalable—constructs
can be produced in a production setting by dedicated staff; but equally
the protocol can be carried out in tubes to process small numbers of
samples. Early proof of concept vectors I produced myself as follows.
Later, as scale increased, constructs were produced by essentially the
same procedure but in 96-well plates, by Gareth Girling and Burcu
Anar from the PlasmoGEM team.
2.1.2.1 Initial vector
TSA strain E. coli containing the desired region of Plasmodium genome
in a pJAZZ backbone were streaked from archival plates and acted
as the starting point for construct generation. An overnight culture
was prepared from this clone in Terrific Broth (TB) with 30 µg/ml
kanamycin.
2.1.2.2 Transformation with recombinase expression vector
The next day this culture was made electrocompetent as follows: it
was diluted to an OD600 of 0.05 and incubated at 37 °C with shaking
until the OD600 reached 0.6–0.8. At this point the culture was
placed on ice for 15 minutes before 1.4 ml was centrifuged (5000g, 2
mins, 4 °C), and washed with ice cold H2O. The washing procedure
was repeated 2 further times, then cells were resuspended in 50 µl of
0.5 ng/µl pSC101gbdA plasmid and transferred to a 1 mm gap
electroporation cuvette. The cuvette was pulsed (1800 V, 25 µF, 200 Ω)
and 950 µl of TB was added; cells were allowed to recover at 30 °C with
shaking for 70 mins before the addition of antibiotic-containing TB
(final concentrations: 30 µg/ml kanamycin, 5 µg/ml tetracycline).
2.1.3 Amplification of Zeo/PheS cassette
The position of the final Plasmodium cassette is determined by the
location at which a temporary positive-negative bacterial selection
cassette is inserted by recombineering. This is specified by amplify-
ing the cassette using primers which add overlaps of >50 nucleotides
homologous to the insertion site. These primers also add the barcode
used in barseq experiments.
The reaction is carried out with the following mixture: 15.5 µl H2O,
1 µl 12 ng/µl zeo/pheS plasmid template, 1.5 µl Advantage buffer, 2.5 µl
2 µM recUp primer, 2.5 µl 2 µM recDown primer, 0.5 µl 10 mM dNTPs,
0.5 µl Advantage Taq2 and cycling parameters: 95 °C 5’ // 95 °C 30" /
58 °C 30" / 72 °C 1’30" (x30) // 72 °C 10’ // 4 °C hold.
After amplification, residual template was digested by the addition
of 1 µl DpnI and incubation at 37 °C for 10 minutes, then salts were
removed by dialysis against water on a 0.1 µm Millipore filter.
The bacteria transformed with the recombinase expression vector were
diluted to an OD600 of 0.05 after overnight incubation. They were
allowed to grow to an OD600 of 0.3–0.4 and then L-arabinose was added
to a final concentration of 0.2% to induce the pBAD promoter that
controls the λ red recombinase, and the temperature was shifted to
37 °C for 40 minutes (the first 5 in a water bath).
The cells were then transformed with the PCR amplicon as described
in 2.1.2.2. This time the recovery period was at 37 °C and the final
antibiotics added were zeocin and kanamycin.
The next day DNA from the (uncloned) culture of zeocin- and
kanamycin-resistant bacteria was extracted using a QIAGEN Spin Miniprep kit
(according to the manufacturer’s instructions but with 2x volumes of
P1, P2 and N3 due to the low copy number of pJAZZ vectors).
This DNA is the Gateway intermediate. It can generate a number
of different possible final vectors through a Gateway reaction with
different donor vectors.
2.1.4 Gateway reaction and transformation
Gateway reactions were set up with 10 µl intermediate pJAZZ vector
(~30 ng/µl), 1 µl donor (100 ng/µl) in TE, 4 µl LR clonase buffer, 2 µl
LR clonase enzyme mix, 3 µl TE. The Gateway reaction was incubated
overnight at 25 °C, then 0.5 µl proteinase K was added and the tube
incubated at 37 °C for 10 minutes to inactivate the reaction. The
product was dialysed against ddH2O on Millipore filters and 5 µl of this
dialysed product was transformed into 50 µl BigEasy-TSA cells with
the parameters described in 2.1.2.2.
The cells were allowed to recover in 1 ml of TB at 37 °C for 70
minutes and the bacteria then plated out on YEG-Cl kanamycin agar
plates and incubated overnight at 37 °C, to select for bacteria that had
lost the Zeo/pheS cassette in the Gateway reaction (by exchanging it
for the final cassette).
2.1.5 Clone verification
Colonies on plates the next day represented potential final targeting
vectors. These were screened by lysate PCR: they were checked for
the presence of a correct recombineering event (primers GW1 and
QCR2), and for the absence of carry-over wildtype plasmid (primers
QCR1 and QCR2) in two separate PCR reactions.
Reaction components: 5 µl H2O, 2.5 µl colony lysate, 12.5 µl GoTaq
mastermix, 2.5 µl 2 µM primer 1, 2.5 µl 2 µM primer 2.
Cycling parameters: 95 °C 5′ // 95 °C 30″ / 50 °C 30″ / 68 °C 1′ (x30)
// 68 °C 10′ // 4 °C hold.
2.1.6 Plate-based protocol
Where vectors were generated in plates, the method employed was
similar to that described above for tubes, but with kits optimised for
96-well plates. This protocol is described in more detail in [154]. This
work was carried out by Gareth Girling and Burcu Anar.
2.1.7 Quality control
Constructs produced through these processes were verified by Illu-
mina sequencing (analysis conducted by Frank Schwach) to exclude
constructs in which any of the following events had occurred:
• Any mutations in the barcode or the primer annealing sites
  flanking it
• A large deletion
• Any mutation that affects either a protein-coding sequence or
  splice site of any gene within the vector
This means that some QC-passed vectors are not base-perfect. The
majority of these cases are insertions or deletions of A or T residues
into homopolymeric repeats. Such changes are probably inevitable
when Plasmodium sequence is maintained in E. coli.
Reagent                   Volume / µl
Forward oligo (100 µM)    9
Reverse oligo (100 µM)    9
10X NEBuffer 2.1          2
Total                     20
Table 2: Annealing reaction for the creation of dsDNA inserts encoding
guide RNAs.
2.1.8 CRISPR vectors
2.1.8.1 Cloning of mother vector
The starting point for the CRISPR constructs was pDC2 Cas9-U6,
developed by Marcus Lee, with the selection marker changed to yDHODH
by Zenon Zenonos. A PCR reaction was used to amplify the P. knowlesi
U6 promoter with overlaps corresponding to the insertion site in the
destination vector (primers: TPR458, TPR459; template: P. knowlesi
genomic DNA). The two P. falciparum constructs, one with dhfr and one
with yDHODH, were cut with SalI and BbsI to remove the P. falciparum
U6 promoter. I assembled these two fragments together with
Gibson Assembly.
Though heterologous promoters are functional in P. knowlesi, I wanted
to rule out any inefficiency due to this and so I amplified the
bidirectional elongation factor 1 alpha promoter (primers: TPR467,
TPR468; 95 °C 2′ // 98 °C 20″ / 55 °C 30″ / 60 °C 3′ (x35) // 60 °C 10′)
and inserted it with Gibson Assembly between yDHODH and Cas9 (after
digesting with AvrII and NcoI).
2.1.8.2 Guide RNA design
2.1.8.3 Cloning of guide RNAs into vectors
Guide RNAs were ordered as two complementary oligos with 4-nucleotide
5′ overhangs complementary to the insertion site. The mother
vector was digested with the Type IIS restriction enzyme BbsI to create
the site for insertion of the guide RNA.
The oligos were annealed by setting up the reaction shown in Table
2 and running the following programme in a thermocycler.
Reagent               Concentration
RPMI-1640             10.43 g/l
L-glutamine           2 mM
HEPES                 7.15 g/l
Hypoxanthine          50 mg/l
Horse serum           10% (v/v)
Albumax               0.5% (w/v)
Sodium bicarbonate    0.24% (w/v)
Table 3: Recipe for P. knowlesi culture medium
1. Incubate at 95 °C for 10 minutes.
2. Ramp temperature down by 10 °C at a rate of 0.6 °C per second.
3. Incubate at this reduced temperature for 1 minute.
4. Repeat steps 2 and 3 until tube reaches 25 °C.
From then on the annealed oligos were kept on ice.
The annealed product was diluted 50x in ddH2O, and 1 µl of the
diluted DNA was ligated into 50–100 ng of digested plasmid in a
10 µl reaction with 1 µl T4 ligase (NEB) and then transformed into
OneShot TOP10 cells according to the manufacturer’s protocol.
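The stepped annealing programme above can be written out as an explicit schedule, which makes it easy to check the number of stages and the total run time. The following is a minimal sketch rather than the programme actually entered into the thermocycler: the function names are hypothetical, and it assumes that the 1 minute hold also applies at the final 25 °C stage.

```python
def annealing_schedule(start_c=95.0, end_c=25.0, step_c=10.0,
                       ramp_c_per_s=0.6, first_hold_s=600.0, step_hold_s=60.0):
    """Return (temperature in C, ramp time in s, hold time in s) for each stage."""
    schedule = [(start_c, 0.0, first_hold_s)]  # step 1: 10 minutes at 95 C
    temperature = start_c
    while temperature - step_c >= end_c:       # steps 2-4: repeat until 25 C
        temperature -= step_c
        schedule.append((temperature, step_c / ramp_c_per_s, step_hold_s))
    return schedule


def total_seconds(schedule):
    """Total run time: every ramp plus every hold."""
    return sum(ramp + hold for _, ramp, hold in schedule)


schedule = annealing_schedule()
for temperature, ramp, hold in schedule:
    print(f"{temperature:5.1f} C  ramp {ramp:4.1f} s  hold {hold:5.0f} s")
print(f"total run time: {total_seconds(schedule) / 60:.1f} min")
```

Under these assumptions the programme has eight stages (95 °C down to 25 °C in 10 °C steps) and takes just under 19 minutes.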
2.2 plasmodium knowlesi in vitro culture
2.2.1 Routine culture
Parasites were maintained in P. knowlesi culture media (Table 3). The
horse serum was added to the medium last and was not filter ster-
ilised. All other components were. Except where otherwise specified,
parasites were maintained at 2% haematocrit.
Leukocyte-depleted O+ blood was obtained from the National Blood
Service approximately weekly. Since P. knowlesi is dependent on the
DARC receptor for successful invasion, blood was screened by a basic
invasion assay prior to use in important cultures: schizonts were
purified and allowed to invade the new blood, then a Giemsa smear
was made to check this was proceeding at normal levels.
2.2.2 Routine pelleting
When necessary to wash, or change media, parasites were pelleted by
spinning at 1100g for 5 minutes.
2.2.3 Schizont purification
A Nycodenz stock solution was made by dissolving Nycodenz (or
alternatively Histodenz, which is chemically identical) at a
concentration of 27.6 g/l. HEPES was added to a final concentration of
1 mM and the pH adjusted to 7.2 with hydrochloric acid. A cushion
solution was then prepared by making a 55% dilution of this stock in
RPMI-1640.
In order to purify schizonts, cultures were pelleted as in 2.2.2 and
resuspended in up to 9 ml RPMI-1640. This was overlaid onto 5 ml
of cushion solution and the gradient was centrifuged at 1500g for 10
minutes with a low brake setting.
Schizonts settled at the interface and were extracted with a Pasteur
pipette. They were washed by adding at least one volume of RPMI-1640
and pelleting as in 2.2.2.
2.2.4 Synchronisation
It is reported (Rob Moon, personal communication) that sorbitol
lysis of late stages, the primary synchronisation method used for
P. falciparum, is not effective for P. knowlesi. Because of this, the
primary method of synchronisation used for this work was Nycodenz
purification of schizonts as described above, followed by selection of
either the schizont-containing interface or the pellet containing the
rings and trophozoites for continued culture.
2.2.4.1 Narrow window synchronisation
For some procedures, in particular transfection, it was necessary to
obtain very synchronous cultures in which one could say with cer-
tainty that (almost) all parasites were within a defined range of ages.
Here a double synchronisation procedure was employed.
First, schizonts were purified as described in section 2.2.3. These
schizonts were added to fresh blood and allowed to reinvade for 60
to 90 minutes with shaking. Over this time a significant portion of
schizonts rupture to release merozoites which reinvade and result in
rings. The culture was examined to ensure that a significant number
of rings had formed and then the schizont purification was repeated
but this time the schizonts were carefully removed, and the pellet
resuspended in complete media and returned to culture. This culture
contains only very young rings, as shown in Fig. 17.
Figure 17: Representative Giemsa-stained smears from the various stages of the
tight synchronisation protocol. (a) Mixed culture with mostly late stages.
(b) Schizonts purified by centrifugation onto a Nycodenz cushion. (c) Culture
after allowing 90 minutes of reinvasion, showing young rings and unruptured
schizonts. (d) Young rings after purifying away unruptured schizonts. These
parasites are all synchronised to within 90 minutes.
2.2.5 Preparation of DNA for transfection
For transfection of pJAZZ vectors, 100 ml of Terrific Broth (TB) was
supplemented with kanamycin (30 µg/ml). A 10% w/v stock of
L-arabinose was prepared in water and filter-sterilised. 100 µl was added
per 100 ml of TB. L-arabinose transiently increases the copy number
of pJAZZ vectors for a higher yield.
Cultures were incubated for 16 hours at 37 °C with shaking. DNA
was then prepared using a QIAFilter Plasmid Midiprep kit, according
to the manufacturer’s instructions. The pellet was resuspended
in 50 µl TE, quantified, and cut with NotI-HF (NEB). DNA was
precipitated with 10% sodium acetate and 0.7 volumes isopropanol at
−20 °C, and washed with 70% ethanol before it was dissolved in 15 µl
sterile TE.
2.2.6 Transfection
Parasites were synchronised within a narrow window (as described
in 2.2.4.1) on the day before each transfection. Around 26 hours af-
ter parasites had been returned to culture they were smeared and
Giemsa-stained. The culture was stained regularly (every hour, then
every 30 minutes) until it reached a stage where the majority of par-
asites were very late schizonts, with haemozoin massing at a single
point per parasite, and the first rings were appearing from invasion
of the most rapidly maturing schizonts. At this point, schizonts were
purified as described in 2.2.3.
Transfections were typically carried out in batches: schizonts were
diluted in complete media such that each 1 ml of media contained
5–10 µl of schizonts. A 1 ml aliquot of this schizont-containing media
was prepared in an Eppendorf tube for each transfection.
For each transfection, one of these 1 ml aliquots was centrifuged
(2000 rpm, 90 seconds) to pellet the schizonts. The supernatant was
removed and schizonts were resuspended in 100 µl transfection mix
(100 µl Lonza nucleofector P3 mixed with up to 10 µl DNA in TE).
100 µl of the schizont-nucleofector-DNA mix was transferred to a nu-
cleofector cuvette and then electroporated with programme FP-158
within 30 seconds. The contents of the cuvette were immediately re-
moved and added to a pre-warmed tube containing 650 µl complete
media and 150 µl packed freshly washed erythrocytes. This tube was
placed on a thermomixer at 37 °C and incubated with shaking for 30
minutes to one hour. After this time, the contents of the recovery tube
were added to 5 ml of pre-warmed media in a well of a 6-well plate.
Selection was initiated on day 1 or 2 post-transfection, with media
changed daily for the first 5 days post-transfection.
2.2.7 Limiting dilution cloning
Parasite cultures were prepared at a haematocrit of 2% and a
parasitaemia of 1–3%. They were counted by live SYBR green staining
as in section 2.2.8, or by counting parasitaemia in 1000 erythrocytes
by Giemsa smear, and diluted successively in RPMI-1640 to a
concentration of 0.5 parasites per 10 µl. 10 µl was then transferred into
200 µl of 2% haematocrit media in each well of a 96-well plate. Media was
changed every 3 days. Approximately 2 weeks after the plates were set up,
the media in some wells would turn yellow due to the presence of a
population of parasites. These wells were screened by Giemsa smear and,
where found to be positive, transferred to larger cultures. Negative wells
were left for an additional week and then screened by Giemsa smear to
confirm the absence of parasites, indicating that positive wells are
likely to be clonal.
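The reasoning that a high proportion of negative wells implies clonality can be made explicit with a Poisson model. The sketch below is illustrative only: it assumes that parasites are distributed into wells independently (Poisson-distributed counts with a mean of 0.5 per well) and that every parasite seeded grows out, neither of which is stated in the protocol. The function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k parasites in a well under a Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def limiting_dilution_stats(mean_per_well=0.5, wells=96):
    """Expected outcome of seeding a plate at a given mean parasite number per well."""
    p_empty = poisson_pmf(0, mean_per_well)
    p_single = poisson_pmf(1, mean_per_well)
    p_positive = 1.0 - p_empty
    return {
        "p_empty": p_empty,
        "p_positive": p_positive,
        # Probability that a well containing parasites was seeded by exactly one
        "p_clonal_given_positive": p_single / p_positive,
        "expected_positive_wells": wells * p_positive,
    }


stats = limiting_dilution_stats()
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions about 61% of wells receive no parasite, and roughly 77% of the wells that do contain parasites were seeded by a single one. Lowering the seeding density raises that fraction at the cost of fewer positive wells.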
2.2.8 Live SYBR green flow cytometry
The starting point for staining was in each case a 96-well plate with
wells containing 50 µl of parasite culture at 2% haematocrit. Staining
was initiated by adding 15 µl SYBR green diluted 1:1000 in RPMI-1640.
Plates were then incubated at 37 °C for 1 hour. They were then
washed twice in 200 µl PBS, and diluted 1:10 into 100 µl PBS before
acquisition using a BD FACSCalibur flow cytometer.
Analysis was performed using FlowJo. Cells were first gated for for-
ward and side-scatter to separate erythrocytes from any debris. This
erythrocyte population was then divided into infected and uninfected
cells based on SYBR-Green fluorescence detected in the FL-1 channel
to calculate parasitaemia.
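The two-stage gating logic can be sketched in a few lines. This is not the FlowJo workflow itself: gates there were drawn manually, whereas the sketch assumes a simple rectangular scatter gate and a single FL-1 threshold, and all numerical values and names are hypothetical.

```python
def parasitaemia(events, scatter_gate, fl1_threshold):
    """Fraction of gated erythrocytes whose FL-1 signal exceeds the threshold.

    events: iterable of (forward_scatter, side_scatter, fl1) tuples.
    scatter_gate: ((fsc_min, fsc_max), (ssc_min, ssc_max)), a rectangular gate.
    """
    (fsc_min, fsc_max), (ssc_min, ssc_max) = scatter_gate
    # Gate 1: keep events whose scatter profile is consistent with erythrocytes
    erythrocytes = [
        event for event in events
        if fsc_min <= event[0] <= fsc_max and ssc_min <= event[1] <= ssc_max
    ]
    if not erythrocytes:
        raise ValueError("no events fall inside the scatter gate")
    # Gate 2: SYBR green positive events are counted as infected
    infected = sum(1 for event in erythrocytes if event[2] > fl1_threshold)
    return infected / len(erythrocytes)


# Hypothetical events: three uninfected cells, one infected cell, one debris event
example_events = [
    (500, 300, 5),
    (520, 310, 8),
    (480, 290, 250),
    (510, 305, 6),
    (50, 20, 400),
]
example_gate = ((300, 700), (150, 450))
print(parasitaemia(example_events, example_gate, fl1_threshold=100))
```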
2.2.9 Luciferase assay
Culture volume to be analysed was pelleted as in 2.2.2. The
supernatant was removed and 1 ml of 0.15% saponin was added to lyse
the red blood cells. The tube was incubated at room temperature for
five minutes, then spun at 2000g for 3 minutes and the supernatant
discarded. The pellet was washed in PBS until the supernatant was
no longer red (2000g, 3 minutes). The parasite pellet was then
resuspended in 65 µl 1x Passive Lysis Buffer (Promega 5x Passive Lysis
Buffer diluted in water), and incubated with shaking for 10–15 minutes
at room temperature to lyse parasites. The mixture was pipetted
up and down to mix, then centrifuged (1000g, 3 minutes) to clear the
supernatant. Triplicate 20 µl samples were pipetted into the wells of
a luminometer plate. Immediately before reading, 50 µl of Promega
Bright-Glo reagent was added to each well. Luminescence counts
were measured for 10 seconds per well.
2.2.10 Genotyping
2.2.10.1 DNA extraction by phenol-chloroform extraction
Where a maximal yield of DNA was needed from low parasitaemia
samples, DNA was purified by phenol-chloroform extraction. Red
blood cells were lysed by adding 10 pellet volumes of ammonium
chloride lysis buffer (0.15 M NH4Cl, 0.01 M KHCO3, 1 mM Na2EDTA;
pH 7.4) and incubating on ice for 5 minutes. The lysate was
centrifuged (5000g, 5 minutes) and the parasite pellet resuspended in
500 µl TNE buffer (50 mM Tris-HCl, 100 mM NaCl, 5 mM EDTA; pH
8.0).
The lysate was transferred to a PhaseLock gel tube and 500 µl of
25:24:1 phenol:chloroform:isoamyl alcohol was added. The tube was
mixed by thorough shaking and then centrifuged (10000g, 5 minutes).
The aqueous supernatant was transferred to a new PhaseLock tube
and 500 µl of chloroform-isoamyl alcohol was added. The tube was
mixed thoroughly and then centrifuged at 10000g for 5 minutes. The
aqueous phase was transferred to a new tube, 3 µl pellet paint was
added and the DNA was precipitated overnight at −20 °C after the
addition of 500 µl isopropanol.
The isopropanol was removed and the pellet dried and then resus-
pended in 50 µl or 100 µl TE.
2.2.10.2 DNA extraction by QIAGEN kit
Where lower yields of DNA were acceptable, the QIAGEN DNeasy Blood
and Tissue kit was used according to the manufacturer’s instructions.
For whole-genome sequencing, large cultures were lysed with 0.15%
saponin to maximise yield without overloading the columns.
2.2.11 Genotyping by short range PCR
To confirm the presence or absence of amplicons shorter than 3 kb,
25 µl PCR reactions with GoTaq Green Master Mix were carried out.
The cycling conditions were: 95 °C 2′ // 95 °C 30″ / 55 °C 30″ /
68 °C 1′ per kb (x35) // 68 °C 10′.
2.2.12 Genotyping by single barcode PCR
To ensure that parasites which came up were not the result of
cross-contamination with other PlasmoGEM parasites, nor of
cross-contamination of vector DNA prior to transfection, the barcode from
the transfected PlasmoGEM vector was on occasion amplified and Sanger
sequenced (GATC Biotech). Amplification was carried out with GoTaq Green
(2.2.11) and DNA was purified with a QIAGEN MinElute PCR purification
kit. The primers used were arg97 and PbGW2, with the sequencing
read made with PbGW2.
2.2.13 Genotyping by integration PCR
To check for successful integration into the parasite genome,
long-range PCR was carried out with GoTaq Long PCR Master Mix,
according to the manufacturer’s instructions.
2.2.14 Genotyping by quantitative PCR
Reactions were set up in white LightCycler plates with 10 µl Bio-Rad
SsoAdvanced universal SYBR Green supermix, 1 µl 10 µM forward
primer, 1 µl 10 µM reverse primer, 2 µl DNA template and 6 µl
nuclease-free water. Where error bars are shown they are the 95%
confidence interval calculated from 3 replicates.
The LightCycler programme was 95 °C 5′ // 95 °C 10″ / 57 °C 10″ /
72 °C 20″ (x45). A melt curve was then carried out to ensure the
presence of a single amplicon.
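For reference, a 95% confidence interval from three replicates can be computed as follows. The text does not state which formula was used, so this sketch assumes the conventional interval based on Student's t distribution with 2 degrees of freedom; the function name and the example values are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 2 degrees of freedom
T_CRIT_95_DF2 = 4.303


def ci95_triplicate(values):
    """Return (mean, lower, upper) for a 95% CI from exactly three replicates."""
    if len(values) != 3:
        raise ValueError("expected exactly three replicate values")
    centre = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(3)
    half_width = T_CRIT_95_DF2 * standard_error
    return centre, centre - half_width, centre + half_width


# Hypothetical threshold-cycle values for one sample
centre, lower, upper = ci95_triplicate([20.1, 20.3, 20.2])
print(f"{centre:.2f} ({lower:.2f} to {upper:.2f})")
```

With only three replicates the critical value is large (4.303, compared with 1.96 for a normal approximation), so intervals computed this way are wide even when the replicates agree closely.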
2.2.15 Genotyping by whole-genome sequencing
Genomic DNA was extracted from parasites by saponin lysis of a
large culture (>50 ml at >5% parasitaemia), resuspending the parasite
pellet in 200 µl of PBS and processing with a QIAGEN DNeasy Blood
and Tissue Kit as per the manufacturer’s instructions for a blood
sample.
Sequencing was carried out by the Illumina team at the Sanger
Institute. DNA was quantified by Qubit Fluorometric Quantitation,
then sheared into 400–600 bp fragments. Libraries were prepared
with the NoPCR protocol [111]. Multiplexed samples were sequenced
on the Illumina MiSeq platform for 150 paired-end cycles according
to the manufacturer’s protocol. Data from the Illumina sequencing
machines were analysed using RTA1.6, RTA1.8 or GA v0.3 analysis
pipelines.
All downstream analysis I performed myself, as outlined below.
2.2.15.1 Mapping target
All P. knowlesi H strain chromosomes, and Mitochondrial, Apicoplast
and bin sequences from the March 2015 genome release were down-
loaded in EMBL format. A custom EMBL file was created by concate-
nating these and also the gateway cassette used in transfection, and
the arms of the pJAZZ-OK vector. A mapping index was constructed
for this custom genome with bowtie2-build.
2.2.15.2 Mapping
CRAM files produced by the Sanger sequencing pipeline were con-
verted to FastQ files for customised analysis.
Bowtie2 was used to map these to the customised EMBL file de-
scribed above as follows:
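# Map paired-end reads to the custom index ("union"); $1 is the sample name
/software/bowtie2/bin/bowtie2 -x union -1 ../$1_1.fastq -2 ../$1_2.fastq -S $1.sam
# Convert to BAM and sort (legacy samtools sort syntax: last argument is an output prefix)
samtools view -bS $1.sam | samtools sort - $1.bam.sorted
# Index the sorted BAM file for viewing
samtools index $1.bam.sorted.bam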
Mapped reads were analysed with Artemis to observe differences in
coverage and points where split reads indicated junctions between
transfected parasite and reference.
2.2.15.3 Assembly
Reads were also de novo assembled using velvet as follows, using a
k-mer size of 51:
velveth $assemblyfile 51 -shortPaired -fastq -separate $1_1.fastq $1_2.fastq
velvetg $assemblyfile
2.2.16 Immunofluorescence
2.2.16.1 Fixation and labelling
All steps were carried out with agitation.
Parasites were fixed in a fixing solution comprising 4%
paraformaldehyde and 0.0075% glutaraldehyde in PBS [193] for 30 minutes.
They were then permeabilised for 10 minutes at room temperature with
0.1% Triton-X in PBS, and blocked for 1 hour with 3% BSA in PBS
at room temperature. They were labelled for 18 hours with
FITC-conjugated goat anti-HA at 4 °C and then washed three times with
PBS for ten minutes at room temperature. Parasites were placed on
a slide and allowed to air dry and then mounted with Prolong Gold
mounting reagent containing DAPI.
2.3 plasmodium berghei
2.3.1 Pooled midiprep
To minimise expense and labour in DNA production, a pooled midiprep
procedure was adopted to produce the DNA for transfection in P.
berghei experiments. Growth blocks were set up with 1 ml of TB-kan
per well, and for each transfection two wells were inoculated with
each vector to be used in an experiment. After 16 hours of growth at
37 °C with shaking, all wells from a single experiment were pooled
together and then processed with a QIAFilter Midi kit (according to
manufacturer’s instructions but using 2x volumes of P1, P2 and P3
due to the use of a low copy vector).
The resulting DNA was spiked with DNA for 7 reference vectors
(prepared by Ana Rita Gomes): four redundant in the blood stages (p25,
p28, soap, p230p) and three known to result in attenuated growth
(plasmepsin IV, PBANKA_140160 and PBANKA_110420). The use
of these universal references allows calibrated comparison across all
PlasmoGEM barseq experiments.
Prior to transfection the pooled DNA was digested with NotI, pre-
cipitated with isopropanol and resuspended in 6 µl TE.
2.3.2 Schizont preparation and transfection
A Wistar rat was infected by intraperitoneal injection and monitored
until blood parasitaemia reached 1–3%. Infected blood was then
harvested by cardiac puncture and prepared as a schizont culture: 25x
blood volume schizont medium (RPMI 1640 supplemented with 25 mM
L-glutamine, 25 mM HEPES, 10 mM NaHCO3, 100 U/ml
penicillin/streptomycin and 25% foetal bovine serum) was added and the
flask gassed for 90 seconds with malaria gas (1% O2, 3% CO2, 96% N2) and
incubated overnight with shaking.
The next day the culture was smeared to check for the presence of
mature schizonts and these were purified on a 55% Nycodenz/PBS
(v/v) cushion with low brake. The schizont-containing interface was
collected with a Pasteur pipette and washed with schizont medium.
Schizonts were pelleted by centrifugation (500g, 2 mins), and the pel-
let resuspended in 16 µl P3 solution (Lonza). The 6 µl vector pool was
then added to this mixture and transfection was carried out in 16-
well Lonza cuvette strips with programme FI-115. The transfection
mixture was immediately injected intravenously into the tail veins of
6-8 week old Balb/c mice.
The next day drinking water was supplemented with pyrimethamine
to begin selection (7 mg/ml).
For each pool three transfections were carried out in parallel and
injected into three mice to act as biological replicates.
The remaining transfection mixture after injection was kept as an
input control for barseq.
Parts 2.3.2 and 2.3.3 of the experiment were conducted by Ellen
Bushell, Ana Rita Gomes and Tom Metcalf; remaining steps I carried
out myself.
2.3.3 Time course
On days 4, 5, 6, 7 and 8, a small sample of blood (<30 µl) was collected
from the tail vein. Samples were collected at the same time on each
day, and Giemsa smears were also made to monitor the infection.¹
The erythrocytes in these time-point samples were lysed by adding
1 ml of lysis buffer (0.15 M NH4Cl, 0.01 M KHCO3, 1 mM Na2EDTA;
pH 7.4) and incubating on ice for 2 minutes. Parasites were then
pelleted by centrifugation (3 mins, 1000g). Genomic DNA was isolated
by phenol-chloroform extraction (2.2.10.1).
2.3.4 Barcode amplification
PCR was used to amplify barcodes from this extracted genomic DNA.
A 50 µl Advantage 2 Polymerase reaction was prepared according to
the manufacturer’s instructions with the primers arg91 and arg97.
1 µl of phenol-chloroform extracted gDNA was used as the template
for the reaction. The protocol for the PCR was as follows: 95 °C 5′ //
95 °C 30″ / 55 °C 20″ / 68 °C 8″ (x35) // 68 °C 10′. A final
sample from which barcodes were amplified in parallel was the ‘input
pool’, that is, the vector DNA used for transfection. This is
important to ensure that absence of a barcode in the population is not
due simply to a failure of the bacteria carrying that vector to grow.
5 µl of the product from this first PCR reaction was then used as
the template for a second PCR which added the Illumina adapters
required for the product to adhere to the Illumina flow cell, and also
index tags to allow the sample corresponding to each read to be
determined after sequencing. The cycling conditions were: 95 °C 5′ //
95 °C 30″ / 68 °C 15″ (x10) // 68 °C 5′.
¹ The collection of the blood from the animals was performed by Ellen Bushell, Ana
Rita Gomes or Tom Metcalf.
A water control was taken through the entire experiment to ensure
reagents were not contaminated with trace amounts of barcodes.
Following the second PCR, DNA was purified using a QIAGEN
MinElute PCR purification kit. Eluted DNA was quantified using the
Qubit system and 100 ng of each sample was taken and pooled to-
gether.
Up to 32 timepoints could be pooled on a single MiSeq lane.
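The pooling arithmetic is simple but easy to get wrong at the bench, so a small helper can be useful. This is a sketch under stated assumptions: the sample names and concentrations are invented, the function name is hypothetical, and the limit of 32 samples per pool is taken from the sentence above.

```python
MAX_SAMPLES_PER_LANE = 32  # limit quoted in the text for a single MiSeq lane


def pooling_volumes(concentrations_ng_per_ul, mass_ng=100.0):
    """Volume (in µl) of each eluted sample needed to contribute mass_ng to the pool."""
    if len(concentrations_ng_per_ul) > MAX_SAMPLES_PER_LANE:
        raise ValueError("too many samples for a single lane")
    volumes = {}
    for sample, concentration in concentrations_ng_per_ul.items():
        if concentration <= 0:
            raise ValueError(f"non-positive concentration for {sample}")
        volumes[sample] = mass_ng / concentration
    return volumes


# Hypothetical Qubit readings in ng/µl
example_volumes = pooling_volumes({"mouse1_day4": 25.0, "mouse1_day5": 40.0})
for sample, volume in example_volumes.items():
    print(f"{sample}: {volume:.1f} µl")
```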
2.3.5 Barcode sequencing
Once pooled, libraries were further quantified by qPCR, diluted to
1 nM and spiked with 40–50% of PhiX (due to the low complexity
of the sequence). They were loaded at low cluster density
(4 × 10⁵ clusters/mm²). 150 bp paired-end reads were collected. These
steps were conducted by the Sanger Institute Illumina C team.
2.4 data analysis
2.4.1 Barseq
The output from the Sanger sequencing pipeline was converted into
a FastQ file containing the raw reads using Picard. I used the script
count_tags.pl written by Frank Schwach to read through this FastQ
file and count the number of barcodes corresponding to each gene for
each time point.
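To illustrate what such a counting step involves, the sketch below tallies barcodes in FastQ reads. It is not count_tags.pl and makes no claim about how that script works: it assumes a known barcode-to-gene table, exact substring matching and standard four-line FastQ records, and the barcode sequences and gene names are invented.

```python
import io
from collections import Counter


def read_fastq_sequences(handle):
    """Yield the sequence line of each four-line FastQ record."""
    for line_number, line in enumerate(handle):
        if line_number % 4 == 1:
            yield line.strip()


def count_barcodes(handle, barcode_to_gene):
    """Count reads per gene, assigning each read to the first barcode it contains."""
    counts = Counter()
    for sequence in read_fastq_sequences(handle):
        for barcode, gene in barcode_to_gene.items():
            if barcode in sequence:
                counts[gene] += 1
                break
    return counts


# Hypothetical barcodes and reads, for illustration only
example_barcodes = {"ACGTACGTAA": "geneA", "GGCCTTAAGG": "geneB"}
example_fastq = io.StringIO(
    "@read1\nTTACGTACGTAAGG\n+\nIIIIIIIIIIIIII\n"
    "@read2\nCCGGCCTTAAGGTT\n+\nIIIIIIIIIIIIII\n"
    "@read3\nTTACGTACGTAACC\n+\nIIIIIIIIIIIIII\n"
    "@read4\nAAAAAAAAAAAAAA\n+\nIIIIIIIIIIIIII\n"
)
example_counts = count_barcodes(example_fastq, example_barcodes)
print(dict(example_counts))
```

A production script would normally anchor the search on the constant sequences flanking the barcode and tolerate sequencing errors; both are omitted here for brevity.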
Further analysis was substantially based on ideas developed in [80].
I created an R script² to analyse the barcode data.
² 8% of lines in the script are unchanged from one written by Ana Rita Gomes; the
rest is original to me.
The procedure for analysing STM data is as follows:
1. For each day (d, 4-8) and each mouse (m, 1-3) calculate what
proportion (p) each gene (g) represents of the total number of
barcodes:
$p_{d,m,g} = \dfrac{b_{d,m,g}}{\sum_{G_0}^{G_n} b_{d,m,G_n}}$,
where $b_{d,m,g}$ represents the number of barcodes for gene g
counted on day d in mouse m.
We treat any ratio < 0.1% as background and set it to 0.
2. For each mouse on days 4-7, calculate the fold-change in barcode
number for each gene on the next day, i.e.
$c_{d,m,g} = \dfrac{p_{d+1,m,g}}{p_{d,m,g}}$.
3. For each mouse on days 4-7, calculate a relative fitness for each
gene by normalising this gene’s fold-change to the mean fold-change
of the reference genes:
$f_{d,m,g} = \dfrac{c_{d,m,g}}{\frac{1}{n}\sum_{R_0}^{R_n} c_{d,m,R_n}}$
4. For each gene perform a t-test comparing the dataset represent-
ing the fold-change on days 4-7, for all mice, with the reference
growth rate on days 4-7 for all mice.
5. Assign a phenotype to each gene as follows:
a) If two or more mice have no (or below background) read-
ings on days 4-7, call the phenotype as "Putative essential"
b) If two or more mice lack at least one reading (i.e. these
are below background), call the phenotype as "Inconsistent
(likely essential)".
c) In all other cases at least two mice have a full set of read-
ings on days 4-7. We can now examine the t-test p-values,
adjusted for multiple comparisons with the FDR method.
Where p<0.05 we call the phenotype as "Slow" or (hypo-
thetically) "Fast" depending on whether the relative fitness
is below or above 1. Where p>0.05, we call the phenotype
as "Putative redundant" (in the blood stages).
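The steps above can be sketched in code. This is not the R script used for the analysis but a minimal Python illustration of steps 1 to 3 and of the decision rules in step 5. All function names, the nested-dictionary layout and the toy counts are assumptions, and the FDR-adjusted p-value from step 4 is taken as an input rather than computed.

```python
from statistics import mean

BACKGROUND = 0.001  # proportions below 0.1% are treated as background


def proportions(counts):
    """Step 1: counts[day][mouse][gene] -> proportions, with background set to 0."""
    p = {}
    for day, mice in counts.items():
        p[day] = {}
        for mouse, genes in mice.items():
            total = sum(genes.values())
            p[day][mouse] = {
                gene: (n / total if total and n / total >= BACKGROUND else 0.0)
                for gene, n in genes.items()
            }
    return p


def fold_changes(p):
    """Step 2: c[day] = p[day + 1] / p[day]; None where p[day] is at background."""
    c = {}
    for day in sorted(p):
        if day + 1 not in p:
            continue
        c[day] = {}
        for mouse, genes in p[day].items():
            c[day][mouse] = {
                gene: (p[day + 1][mouse][gene] / value if value > 0 else None)
                for gene, value in genes.items()
            }
    return c


def relative_fitness(c, reference_genes):
    """Step 3: divide each fold-change by the mean fold-change of the references."""
    f = {}
    for day, mice in c.items():
        f[day] = {}
        for mouse, genes in mice.items():
            reference = mean(
                genes[r] for r in reference_genes if genes.get(r) is not None
            )
            f[day][mouse] = {
                gene: (value / reference if value is not None else None)
                for gene, value in genes.items()
            }
    return f


def call_phenotype(readings_by_mouse, adjusted_p, mean_fitness):
    """Step 5: readings_by_mouse maps each mouse to its proportions on days 4-7."""
    no_readings = sum(1 for r in readings_by_mouse.values() if all(v == 0 for v in r))
    incomplete = sum(1 for r in readings_by_mouse.values() if any(v == 0 for v in r))
    if no_readings >= 2:
        return "Putative essential"
    if incomplete >= 2:
        return "Inconsistent (likely essential)"
    if adjusted_p < 0.05:
        return "Slow" if mean_fitness < 1 else "Fast"
    return "Putative redundant"


# Toy counts for one mouse on two days (hypothetical gene names)
toy_counts = {
    4: {"m1": {"ref1": 400, "ref2": 400, "geneX": 200}},
    5: {"m1": {"ref1": 450, "ref2": 450, "geneX": 100}},
}
toy_fitness = relative_fitness(fold_changes(proportions(toy_counts)), ["ref1", "ref2"])
print(toy_fitness[4]["m1"]["geneX"])
```

In the toy data the references each grow from 40% to 45% of the pool while geneX falls from 20% to 10%, so its fold-change of 0.5 is normalised by a reference fold-change of 1.125 to give a relative fitness of about 0.44.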
Part II
RESULTS
3
INITIAL INVESTIGATIONS: GENE SELECTION AND P. BERGHEI SCREEN
3.1 assessing the hu et al. invadome
My overarching aim in this project was to apply high-throughput
reverse genetics to genes hypothesised to play a role in erythrocyte
invasion. As discussed in the introduction, one set of such genes was
described by Hu et al. in 2010, derived from the PlasmoINT interaction
network.
Before beginning experiments I analysed this gene set, to confirm
its suitability for investigation.
3.1.1 ‘Invadome’ genes have expression patterns consistent with a role in
invasion
One naïve heuristic for identifying genes potentially involved in
invasion would be to simply select genes expressed late in the
intraerythrocytic development cycle. This is the point at which the
merozoites (invasion machines) are being built. Does the Hu et al.
invadome differ very much from the results of such an approach?
To investigate this I downloaded the 2003 Bozdech et al. microarray
timecourse, which provides values for gene expression in the HB3
strain of P. falciparum at hourly increments over the 48 hour IDC [22].
In this dataset many genes are covered by more than one probe, so
I computed an expression value for each gene at each timepoint by
taking the mean of all probes covering it. I then found the timepoint
of maximal expression for each gene, and subdivided this data de-
pending on whether or not the gene was a member of the Hu et al.
invadome.
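The analysis just described amounts to three small operations: averaging probes per gene, finding the timepoint of maximal expression, and splitting the results by invadome membership. The following sketch shows one way to express this. It is not the code used for the analysis; the function names, gene identifiers and expression values are hypothetical, and missing values, which occur in real microarray data, are not handled.

```python
from collections import defaultdict
from statistics import mean


def gene_expression(probe_rows):
    """Average all probes covering each gene at each timepoint.

    probe_rows: iterable of (gene_id, [expression value at each timepoint]).
    """
    profiles_by_gene = defaultdict(list)
    for gene, values in probe_rows:
        profiles_by_gene[gene].append(values)
    return {
        gene: [mean(column) for column in zip(*profiles)]
        for gene, profiles in profiles_by_gene.items()
    }


def peak_timepoints(expression, timepoints):
    """Timepoint of maximal expression for each gene (first maximum on ties)."""
    return {
        gene: timepoints[max(range(len(values)), key=values.__getitem__)]
        for gene, values in expression.items()
    }


def split_by_membership(peaks, invadome):
    """Separate peak times of invadome genes from those of all other genes."""
    inside = [t for gene, t in peaks.items() if gene in invadome]
    outside = [t for gene, t in peaks.items() if gene not in invadome]
    return inside, outside


# Toy data: two probes for geneA, one for geneB, three timepoints
example_rows = [
    ("geneA", [1.0, 2.0, 5.0]),
    ("geneA", [3.0, 2.0, 7.0]),
    ("geneB", [4.0, 1.0, 0.5]),
]
example_peaks = peak_timepoints(gene_expression(example_rows), [1, 2, 3])
inside, outside = split_by_membership(example_peaks, {"geneA"})
print(example_peaks, inside, outside)
```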
Reassuringly this analysis (Fig. 18) showed that the vast majority of
the invadome genes are indeed expressed late in the erythrocytic
cycle, as we would expect (these genes are part of a network imputed,
in part, by correlated expression with other invasion genes which
will themselves peak late in the IDC). The median peak for invadome
genes was at 38 hours. Additionally I saw that only 50% of genes
showing maximal expression at this 38 hour point were part of the
invadome. This may indicate that the perturbation-based approach
used to generate the PlasmoINT network has provided a more specific
subset of genes than one based entirely on expression over the
course of the IDC (intraerythrocytic development cycle), though it
does not prove this.
Figure 18: Genes in the Hu et al. invadome peak late in the IDC, but do not
encompass all late-peaking genes.
Histogram showing the peak expression [22] of all P. falciparum
genes and the subset that are part of the invadome (blue). Axes:
timepoint / hours (x), genes with peak expression (y).
3.1.2 Gene Ontology analysis suggests the relevance of the invadome gene
list
I also looked for gene ontology terms which were enriched in this set.
As one might anticipate, the highest enrichment was for "entry into
host" with 8-fold enrichment.
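Fold enrichment for a term is the fraction of genes in the set carrying the annotation divided by the fraction in the whole genome. The sketch below shows the calculation together with a one-sided hypergeometric p-value. The text does not say which tool or test was used, so the choice of test is an assumption, and the gene counts are invented purely to give an 8-fold example.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """(k / n) / (K / N): annotated fraction in the set relative to the genome.

    k: annotated genes in the set, n: genes in the set,
    K: annotated genes in the genome, N: genes in the genome.
    """
    return (k * N) / (n * K)


def hypergeom_upper_tail(k, n, K, N):
    """Probability of seeing k or more annotated genes in a random set of size n."""
    total = comb(N, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    )
    return favourable / total


# Hypothetical counts chosen only to illustrate an 8-fold enrichment
example_fold = fold_enrichment(k=32, n=400, K=50, N=5000)
example_p = hypergeom_upper_tail(k=32, n=400, K=50, N=5000)
print(example_fold, example_p)
```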
However, 11 genes annotated with the “entry into host” term were
not included in the Hu et al. set. I investigated these with literature
searches. Four were sporozoite proteins (CSP, CTRP, TRAP, MAEBL),
two were involved in midgut invasion (PLP3, PLP5 [58]) and two
have no proven role in invasion other than expression at the right
stages (P12 [187] and falcipain 1 [84, 174]).
I had been concerned that, given the same organelles are involved
in invasion across the Plasmodium life-cycle, the network-based
perturbation approach might have identified proteins which are involved
in cellular invasion, but operate primarily or exclusively in other
stages of the parasite life-cycle. The fact that several such genes are
actually excluded from the Hu et al. invadome provides some reassurance
that this is a relatively specific erythrocyte invadome.
However, two of the non-included genes with the "entry into host"
annotation do have clear roles in erythrocyte invasion (falstatin ICP
[153] and DOC2 [62]). These represent false negatives, but they
appear relatively few in number.
The other concern is false positives: genes which have no real
role in invasion. A small number of clear false positives stood out
from the rest of the invadome. Pantothenate transporter (PAT), folate
transporter 1 and FCP (which is involved in the food vacuole [138])
all seemed unlikely to actually be involved in invasion.
One undefined question is how intimately a gene has to be involved
in the entry into the host cell to be considered part of the invadome.
There were four ApiAP2 genes in the gene list; these bind to DNA
and so are unlikely to be directly involved in invasion but might well
play key roles in regulating invasion-related genes, which would, in
my view, make them rightful inclusions. There were also a number of
genes included with known roles in egress (PKG, SUB1), and it is
possible the PlasmoINT approach does not have the ability to separate
egress and invasion. But these processes are interconnected and may
share components, and so analysing them together may be helpful.
3.1.3 A number of PlasmoINT predictions have been borne out by subse-
quent research
There are a number of examples I uncovered in my research of the
invasion literature that anecdotally suggest the predictive power of
the analysis method that produced the PlasmoINT network. Since
the Hu et al. invadome was described in 2010, a number of genes
have been newly identified to have roles in invasion. Several such
discoveries in recent years were in fact foreshadowed in the Hu et al.
list.
Though the role of MyoA in invasion has been clear for a long
time, Myosin B, and its light chain MLC-B, have only recently been
discovered to have expression and localisation highly suggestive of
an involvement in invasion; [211] both were predicted by Hu et al. to
be involved in invasion purely on the basis of the interaction network.
In addition, the traditional model of the Plasmodium motor complex,
inspired by Toxoplasma, involved aldolase acting as the link between
the actin cytoskeleton and adhesin proteins. [152] Interestingly the
Hu et al. invasion subnetwork does not include aldolase. Until last
year I would have described such a result as a false-negative. However
in 2014 renewed research in Toxoplasma identified aldolase as essential
for metabolism but not invasion [173]. In light of this, a new candidate
glideosome-adhesin connector (TgGAC) has been suggested to play
this key role across the Apicomplexa. [95] The P. falciparum orthologue
of this gene had already been included in the Hu et al. invadome,
though it was at the time a wholly uncharacterised protein. Thus
the Bayesian approach excluded a false-positive in the literature and
instead included the real interactor.
In sum, though the PlasmoINT invadome is clearly not a perfect
list of the invasion machinery, it appears a good starting point for
a screen-based approach to invasion. While a curatorial approach
would yield an invadome more in line with the literature, it would
exclude the genes that it is most important to investigate: those we
currently know nothing about.
3.2 selecting the core invadome
Hu et al. defined their putative invadome with 418 P. falciparum genes,
but I wanted to analyse the core invasion machinery shared among
many Plasmodium species. To identify this core invadome, I searched
for those members of the Hu et al. invadome showing 1:1:1 orthology
between P. falciparum, P. berghei and P. knowlesi. I used a phyletic
pattern search on OrthoMCLDB [118] to retrieve this dataset, sum-
marised in Fig. 19. The search string used was “pber+pfal+pkno=3T
AND pber+pfal+pkno=3”. The first half of the expression requires that
an ortholog group contains at least one member from each of the
three target species. Given this, the second half of the expression en-
sures that they do not contain more than one. This approach would,
I hypothesised, select genes which for the most part played identical
roles in the three organisms in question.
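As a rough illustration of what the two halves of the phyletic pattern express, the following sketch applies the same conditions to a toy table of ortholog groups. The group names, gene labels and the helper `is_one_to_one_to_one` are hypothetical and chosen only for this example; this is not the OrthoMCL-DB implementation.

```python
from collections import Counter

# Species codes as used in the search string.
SPECIES = ("pber", "pfal", "pkno")

def is_one_to_one_to_one(members):
    """members: list of (species_code, gene_id) tuples for one ortholog group."""
    counts = Counter(species for species, _ in members)
    # First half of the pattern: every target species is represented.
    all_present = all(counts[s] >= 1 for s in SPECIES)
    # Second half: the three species contribute exactly three genes in total.
    total = sum(counts[s] for s in SPECIES)
    return all_present and total == 3

# Toy ortholog groups (hypothetical identifiers).
groups = {
    "OG_a": [("pber", "b1"), ("pfal", "f1"), ("pkno", "k1")],                  # 1:1:1
    "OG_b": [("pber", "b2"), ("pfal", "f2"), ("pfal", "f3"), ("pkno", "k2")],  # paralogue
    "OG_c": [("pfal", "f4"), ("pkno", "k3")],                                  # absent in pber
}

core = [name for name, members in groups.items() if is_one_to_one_to_one(members)]
```

Here only `OG_a` passes: `OG_b` fails the second condition because it holds two P. falciparum genes, and `OG_c` fails the first because it has no P. berghei member.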
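[Figure 19 Venn diagram: sets for P. knowlesi, P. falciparum and P. berghei; central three-way intersection 4,082; remaining region counts in extraction order, not assignable to regions: 196, 135, 100127, 134, 79]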
Figure 19: The majority of the Plasmodium genome is 1:1:1 orthologous across P.
berghei, P. falciparum and P. knowlesi.
Note that there are two reasons an orthologue group might be
excluded: it could contain no members in the target organism or
it could contain more than one member.
I then identified the subset of these 4,082 genes contained within
the 418 gene Bozdech invadome. This yielded 320 genes as members
of the ’core invadome’. I hypothesised that such an approach would
select the evolutionarily conserved ancestral genes involved in many
of the key processes of invasion. However I was aware that certain
important invasion genes would be excluded as a result. Invasion
takes place at the parasite-host interface and so involves a number of
rapidly evolving gene families which do not show 1:1:1 orthology.
Notable examples include invasion ligands, where there is a vast se-
lective pressure for the evolution of novelty and evolutionary arms
races take place between parasite and host. However the aim of this
study was to carry out a pan-Plasmodium assessment of conserved
[Figure 20 legend: Not essential; Deletion failed; Not attempted]
Figure 20: Prior to this study few attempts to disrupt genes in the core invadome
had been reported.
Each icon represents a single gene. Its color represents known tar-
getability in P. berghei prior to this work. Data are from RMGMdb.
[105]
genes and the >75% of the invadome that is included in this set is a
good list of candidates for such analysis.
1
3.2.1 Few invadome genes have been previously analysed experimentally
in P. berghei
To put into context my experimental work I first searched for what
was known already about the targetability of these genes.
As of July 2015, the RMGM database contained records for at-
tempts to knock out 52 of these 320 genes in P. berghei, with 23 of these
attempts successful and 29 unsuccessful (Fig. 20). Less than 20% of
the core invadome has therefore previously been studied using exper-
imental genetics, consistent with the limited extent of genetic studies
across the entire genome.
This is likely a slight underestimate as the database may not be
comprehensive. If data from P. falciparum are included also then tar-
getability is described for an additional 18 genes, bringing the to-
tal coverage to 22%. However because P. falciparum has been less
amenable to transfection we can be less confident in the calls of ’essential’
genes in this species. Regardless, it is clear that high-throughput
reverse genetics had the potential to reveal a great deal more about
this set of genes.
1 8 orphan genes were added back for P. berghei phenotyping: PBANKA_031660, PBANKA_090080,
PBANKA_100010, PBANKA_133270, PBANKA_134910, PBANKA_110140, PBANKA_102250
3.3 barseq in p. berghei
The PlasmoGEM project has proceeded furthest in P. berghei, because
of the ease with which the parasite’s genome can be manipulated.
[154] Knock-out vectors exist for the majority of genes, with coverage
growing each week, and so this system seemed the best starting point
for a high-throughput investigation of the invadome. In particular,
the barseq protocol had been established in P. berghei and used to
great effect. [79]
3.3.1 Barseq experiment 1
I will discuss the first barseq experiment in some detail to establish
the procedure followed and its strengths and weaknesses.
All P. berghei vectors were produced by the PlasmoGEM recombi-
neering team using the plate-based recombineering method. [155] I
prepared a pool of 48 vectors (listed in appendix A.1) for this exper-
iment; these were mixed with the seven growth rate control vectors
(targeting p25, p28, soap, p230p, plasmepsin IV, PBANKA_140160 and
PBANKA_110420) and transfected into parasites which were injected
into three mice, as described in section 2.3.
2
All three mice devel-
oped a parasitaemia under pyrimethamine pressure which became
visible on day 4 and steadily increased. Small blood samples were
taken daily from day 4 to day 8 for DNA extraction. I amplified bar-
codes from these samples and they were sequenced on an Illumina
flow cell, along with those from the input DNA.
3.3.1.1 Barseq generates a rich dataset
[Figure 21 plot: x-axis "Gene targeted"; y-axis "Ratio" (0% to 18%); series: input, day 4, day 5, day 6, day 7, day 8]
Figure 21: Mutant abundances at each timepoint in one mouse
This plot illustrates the rich data provided by a single animal in
a barseq experiment. 324 data points represent the proportion of
each barcode at each time point. Three vectors are highlighted,
one essential (red), one redundant (green) and one with attenuated
growth (yellow). Control genes are indicated in grey. This
figure is intended as an illustration of the data provided by a single
mouse, but raw barcode ratios are shown for every gene in
every mouse in the appendix.
2 All steps involving animals were conducted by Ellen Bushell, Ana Rita Gomes and
Tom Metcalf
In the first stage of analysis I counted barcodes belonging to each
vector at each timepoint in each mouse and normalised these to that
sample’s total number of barcodes, to determine the percentage composition
of each knock-out in the population. These raw ratios from
a single mouse can be seen in Fig. 21.
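The normalisation of barcode counts to a sample's total can be sketched as follows. This is a minimal illustration rather than the pipeline described in 2.4.1; the function name `barcode_proportions` and the toy counts are assumptions made for the example.

```python
def barcode_proportions(counts):
    """Convert raw barcode read counts for one sample into proportions.

    counts: dict mapping barcode (vector) name -> number of reads.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no barcode reads")
    return {barcode: n / total for barcode, n in counts.items()}

# Hypothetical counts for one mouse at one timepoint.
sample = {"gene_A": 600, "gene_B": 300, "gene_C": 100, "gene_D": 0}
proportions = barcode_proportions(sample)
```

Because each sample is scaled by its own total, proportions are comparable between timepoints even when sequencing depth differs, but they only ever describe relative, not absolute, abundance.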
The proportion of each vector in the input is relatively constant,
indicating that the pooled midiprep procedure (2.3.1) has not overly
biased our experiment. There are 7 genes with lower input values;
these are in fact the 7 controls that were added in. There must have
been an error of quantitation, either an overestimation of these con-
trols or an underestimation of the other vectors which has resulted in
these spike-ins being present at 11% of the abundance of other vec-
tors. Nonetheless this did not prevent the experiment from working,
as these genes still gave correct growth phenotypes (slow/redundant
depending on the vector).
Despite the presence of barcodes for all genes of interest in the
input, by day 4 many vectors show no barcodes whatsoever (e.g.
PBANKA_133890). This is because naked DNA is cleared rapidly
from the plasma in vivo [171], thus the only way in which a barcode
can persist is if it is carried inside a parasite. P. berghei has a high
propensity for integrating linear homologous DNA into its genome,
meaning that vectors persist by knocking out their target genes (there
is no evidence for episomal carriage of PlasmoGEM vectors in P.
berghei). Where this target gene is essential parasites will die, and ei-
ther lyse or be cleared by the spleen, removing the barcodes from the
circulation. Thus we can speculate that those genes which show no
barcodes on day 4 may be essential.
The analysis is not completely straightforward, however. We see
that even where mutants grow at similar rates, and have similar input
ratios (e.g. compare PBANKA_031480 and PBANKA_133580), they
may have quite different representations on day 4. This demonstrates
that different vectors integrate into the parasite genome with differ-
ent efficiencies. Efficiency is in part a factor of the length of vector
arms [154] but also may be affected by the state of chromatin around
the gene. We therefore cannot exclude the possibility that we have
false negative results from vectors which are unable to integrate. Dis-
proving this possibility, and confirming essentiality, always requires a
conditional or complementation approach. That said, previous proof-
of-concept experiments for the barseq technique showed concordance
with previous screens. [79] To be accurate however, where barcode levels
are below the detection threshold by day 7 genes are described as
“likely essential”, rather than essential, which cannot be defined by
this technique.
By looking at the shape of the graph beyond day 4 we can begin
to get a sense of the dynamics of growth occurring within the mouse.
Some genes show barcode counts which increase continually over the
course of the infection (e.g. PBANKA_031480); these are likely to be
almost entirely redundant.
Others appear to decline in proportion over the course of the in-
fection (e.g. PBANKA_136440). These represent slow-growing knock-
out parasites. Although they are growing, and so persist to day 4 and
beyond, they are continually being outcompeted by the redundant
knock-outs, such that they would ultimately drop below detectable
levels by barseq (despite the fact that the actual number of parasites
is steadily increasing).
3
[Figure 22 plot: x-axis "Gene targeted" (one entry per vector in the pool); y-axis "Fold-change" (0.0 to 1.5); legend: Mouse (1, 2, 3); Day (5 - 6, 6 - 7, 7 - 8)]
Figure 22: Fold-changes of all genes in barseq experiment 1
Error bars depict 95% confidence interval calculated across all
days and all mice. This diagnostic plot allows a simple assessment
of whether growth effects apply across all mice and of
whether such effects change over the course of the infection.
There is more uncertainty over the growth rate of slow-growing
mutants than redundant ones because these growth rates are inferred
from smaller numbers of parasites, leaving noise proportionally
higher.
Finally, a group of genes have no barcodes by day 7, and are called
as “Likely essential”.
These dynamics define the limit of the most extreme phenotypes
that can be detected. An extremely slowly growing parasite might
already be competed to below the detection threshold early in the
experiment, leading it to be measured as putatively essential.
3.3.1.2 Barseq assigns numerical values to parasite fitness
The style of analysis above is qualitative, but helpful for understand-
ing the strengths and limitations of the barseq approach. However a
key strength of barseq is the ability to construct an analysis pipeline
which with minimal human intervention gives robust and highly
quantitative phenotyping results.
3 The steady nature of the trend in parasite abundance, and the consistency between
mice, demonstrates that these effects are the result of selection and not random
genetic drift.
[Figure 23 plot: x-axis "Day" (4 to 7); y-axis "Fold-change" (0.0 to 2.0); colour scale "Mean fold-change" (0.6 to 1.2)]
Figure 23: Line graph depicting how fold-changes themselves change over time
Each line depicts a gene, its colour determined by its mean fold
change. It is clear that for likely redundant genes, with fold-
changes around 1, the most consistent data comes towards the
end of the infection where there are large numbers of mutant
parasites, reducing the effect of noise.
The approach I adopted, highly inspired by [80] but re-implemented
in a more self-contained bioinformatic approach, is described in 2.4.1.
I developed a pipeline which in a single click can go from the FASTQ
output of an Illumina sequencing machine to assign phenotypes “Likely
essential”, “Redundant” and “Slow growth” and also along the way
produce a number of diagnostic graphs.
The first stage is to compute fold-changes in barcode ratios from
one day to the next, i.e. day 4 to day 5, day 5 to day 6, day 6 to
day 7 and day 7 to day 8. These values are shown in Fig. 22 (one
of the diagnostic plots produced by the script). We can see that on
this basis alone we can get a good sense of whether a gene appears
to be redundant, likely essential or will result in slow growth when
disrupted.
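The day-to-day fold-change calculation can be sketched as below. This is an illustrative reimplementation under simple assumptions, not the script itself: the function name `fold_changes`, the choice to return `None` when the earlier proportion is zero, and the toy proportions are all hypothetical.

```python
def fold_changes(series):
    """Fold-change in barcode proportion between consecutive sampled days.

    series: dict mapping day -> proportion of one barcode in one mouse.
    Returns a dict mapping (earlier_day, later_day) -> ratio.
    """
    days = sorted(series)
    result = {}
    for earlier, later in zip(days, days[1:]):
        if series[earlier] > 0:
            result[(earlier, later)] = series[later] / series[earlier]
        else:
            # Ratio is undefined when no barcodes were seen on the earlier day.
            result[(earlier, later)] = None
    return result

# Hypothetical proportions for a single barcode on days 4 to 7.
example = fold_changes({4: 0.02, 5: 0.03, 6: 0.03, 7: 0.015})
```

In this toy series the barcode gains ground from day 4 to 5, holds steady from day 5 to 6 and halves from day 6 to 7, which is the kind of pattern the diagnostic plot is designed to expose.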
However, these fold-change data are not a consistent value of a par-
ticular knock-out but rather the result of an interaction between the
fitness of the mutant and the background against which it is compet-
ing; a mutant can only increase in relative abundance at the expense
of some slower growing mutant. Therefore the distribution of fold-
changes shifts over the course of the infection.
This effect is shown in Fig. 23. On day 4 redundant genes have fold-changes
substantially above 1, as they increase at the expense of
slow-growing mutants. By the end their fold-change drops towards 1,
because slow-growing mutants are such a small part of the population
that there is very little change in the proportion of normally-growing
mutants over time (e.g. a change from 98% to 99%).
[Figure 24 plot: x-axis "Day" (4 to 7); y-axis "Fold-change" (0.0 to 2.0); series: PBANKA_140160, PBANKA_110420, PBANKA_103440, PBANKA_103780, PBANKA_051500, p230p-tag, PBANKA_051490]
Figure 24: Fold-change values for the six control constructs are as predicted.
Values shown are the mean from three mice. Mutants predicted (and
demonstrated in [80]) to be normally growing (shown
in green) do indeed have higher fold-changes than attenuated-growth
controls (shown in yellow).
Because of this effect the fold-change can tell us only about the
relative growth of different mutants on each day of the experiment.
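A toy simulation makes this dependence on the competing background concrete. It is only a sketch: the two mutant labels, the per-day multiplication rates (10 and 5) and the equal starting abundances are arbitrary assumptions, not measured values.

```python
def simulate_proportions(growth_per_day, start, days):
    """Proportions of exponentially growing mutants sharing one host.

    growth_per_day: dict mutant -> fold increase in parasite number per day.
    start: dict mutant -> starting abundance.
    Returns a list with one dict of proportions per day (day 0 to `days`).
    """
    abundances = dict(start)
    history = []
    for _ in range(days + 1):
        total = sum(abundances.values())
        history.append({m: a / total for m, a in abundances.items()})
        abundances = {m: a * growth_per_day[m] for m, a in abundances.items()}
    return history

# Hypothetical growth rates: the "slow" mutant multiplies half as fast.
growth = {"normal": 10.0, "slow": 5.0}
history = simulate_proportions(growth, {"normal": 1.0, "slow": 1.0}, days=4)

# Day-to-day fold-changes in proportion for each mutant.
normal_fc = [history[d + 1]["normal"] / history[d]["normal"] for d in range(4)]
slow_fc = [history[d + 1]["slow"] / history[d]["slow"] for d in range(4)]
```

Although both growth rates are constant, the fold-change of the normally growing mutant falls towards 1 as it comes to dominate the population, while that of the slow mutant falls towards the ratio of the two growth rates (0.5 here).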
3.3.1.3 Control genes allow the normalisation of fitness values
To gain an absolute view of a mutant’s fitness, we can normalise its
growth to mutants generated in the same experiment but with vectors
which produce known fitness effects. The same 7 vectors are added
to each barseq transfection to act as a universal comparator between
experiments (see section 2.3.1).
We can analyse these genes as an initial check on the experiment
(Fig. 24). I found that the slow-growth controls did indeed grow more
slowly than their redundant counterparts, giving some confidence in
the results of the experiment. These results reflect the ratio between
the abundance on two days. The abundances on the final days are
the result of compounding these results over all preceding days and
so the graphs shown in the appendix show a markedly higher final
abundance for the normal-growth controls as compared to the slow
controls.
One can also see that there is a ’sweet-spot’ for the acquisition of the
best growth phenotype data. On day 4 there is a great deal of noise,
likely due to PCR amplification from a low number of molecules and
on day 7 the proportion of slow growing parasites is becoming very
low indeed leading to less accurate results.
4
With these control genes behaving as anticipated, I used them to
normalise my data. I calculated the mean fold-change of the three
redundant knockouts in each mouse at each timepoint. This can be
set to a fitness of 1, and all fold-changes normalised on this basis.
The fitness values on day 4 are unreliable due to the low para-
sitaemia at this point, so these were discarded. The remaining values
were used to compute a mean fitness for each gene, and to conduct
a t-test against the redundant controls to test for statistically signif-
icant attenuated growth. The resulting p-values were corrected for
multiple comparisons and based on them phenotypes were assigned
as described in 2.4.1.
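The normalisation step can be sketched as follows, stopping short of the t-test and multiple-comparison correction. The data layout, the function name `relative_fitness`, the convention that each fold-change is keyed by the earlier day of its interval, and all values are assumptions made for illustration; the actual procedure is the one described in 2.4.1.

```python
from statistics import mean

def relative_fitness(fold_changes, redundant_controls, discard_days=(4,)):
    """Mean fitness per gene, relative to the redundant controls.

    fold_changes: dict gene -> dict (mouse, day) -> fold-change, where `day`
        is taken here to be the earlier day of the interval (an assumption).
    redundant_controls: gene names whose mean fold-change defines fitness 1.
    """
    keys = {key for values in fold_changes.values() for key in values}
    fitness = {gene: [] for gene in fold_changes}
    for mouse, day in sorted(keys):
        if day in discard_days:
            continue  # early values are too noisy to use
        # Reference: mean fold-change of the controls in this mouse on this day.
        reference = mean(fold_changes[c][(mouse, day)] for c in redundant_controls)
        for gene, values in fold_changes.items():
            if (mouse, day) in values:
                fitness[gene].append(values[(mouse, day)] / reference)
    return {gene: mean(vals) for gene, vals in fitness.items() if vals}

# Hypothetical fold-changes for three controls and one gene in one mouse.
data = {
    "ctrl_1": {(1, 5): 1.2, (1, 6): 1.1},
    "ctrl_2": {(1, 5): 1.0, (1, 6): 1.1},
    "ctrl_3": {(1, 5): 0.8, (1, 6): 1.1},
    "gene_X": {(1, 4): 3.0, (1, 5): 0.5, (1, 6): 0.55},
}
fitness = relative_fitness(data, ["ctrl_1", "ctrl_2", "ctrl_3"])
```

Dividing by the control mean within each mouse and day removes the day-to-day drift in raw fold-changes, so that a gene growing at half the rate of the controls scores about 0.5 regardless of when it was measured.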
3.3.2 Barseq experiment 2
Invadome barseq experiments were carried out in two batches. The
first, of 48 vectors, was conducted as described above. The second
was carried out to phenotype an additional 97 vectors, with the same
procedure previously described, and early checks again suggested
that the experiment was working correctly (data not shown).
Exactly the same analysis procedure was carried out to identify
statistically significant mutant growth phenotypes, and henceforth I
will discuss the entire dataset together.
3.4 barseq growth phenotypes
Both datasets were analysed separately as above and then their growth
phenotypes collated to produce the final dataset. In total I was able to
assign potential growth phenotypes to 145 putative invasion-related
genes. Twenty-five of these already had a growth phenotype described
on RMGMdb. The remainder are to my knowledge being targeted for
the first time in P. berghei.
4 In fact these effects are exacerbated in these genes in this particular experiment,
because the control genes were spiked in at a lower level than other vectors. The
fact that results for these genes are still accurate illustrates the robustness of the
methodology.
The results are shown in Table 4. Around half of all genes were
found to be likely essential. Of those that were targetable, 81% showed
no detectable growth phenotype with the remainder showing attenu-
ated growth. We did not detect any genes whose deletion significantly
increased growth rates.
Table 4: Barseq growth phenotypes for invadome genes
id description putative phenotype growth rate
PBANKA_141980 3’,5’-cyclic nucleotide phosphodiesterase, putative Essential
PBANKA_122670 AAA family ATPase, putative Essential
PBANKA_122760 adenylyl cyclase beta, putative (ACbeta) Essential
PBANKA_050410 autophagy-related protein 8 (ATG8) Essential
PBANKA_083560 cAMP-dependent protein kinase catalytic subunit (PKAc) Essential
PBANKA_113780 conserved Plasmodium protein, unknown function Essential
PBANKA_072150 conserved Plasmodium protein, unknown function Essential
PBANKA_080960 conserved Plasmodium protein, u. f. Essential
PBANKA_092940 conserved Plasmodium protein, u. f. Essential
PBANKA_081510 conserved Plasmodium protein, u. f. Essential
PBANKA_050180 conserved Plasmodium protein, u. f. Essential
PBANKA_135610 conserved Plasmodium protein, u. f. Essential
PBANKA_040640 conserved Plasmodium protein, u. f. Essential
PBANKA_030110 conserved Plasmodium protein, u. f. Essential
PBANKA_100920 conserved Plasmodium protein, u. f. Essential
PBANKA_092250 conserved Plasmodium protein, u. f. Essential
PBANKA_010260 conserved Plasmodium protein, u. f. Essential
PBANKA_112410 conserved Plasmodium protein, u. f. Essential
PBANKA_040690 conserved Plasmodium protein, u. f. Essential
PBANKA_091940 conserved Plasmodium protein, u. f. Essential
PBANKA_134700 conserved Plasmodium protein, u. f. Essential
PBANKA_051390 conserved Plasmodium protein, u. f. Essential
PBANKA_100170 conserved Plasmodium protein, u. f. Essential
PBANKA_141380 conserved Plasmodium protein, u. f. Essential
PBANKA_010630 conserved Plasmodium protein, u. f. Essential
PBANKA_141200 conserved Plasmodium protein, u. f. Essential
PBANKA_101040 conserved Plasmodium protein, u. f. Essential
PBANKA_122440 ferlin like protein, putative Essential
PBANKA_133890 glideosome associated protein with multiple membrane spans 1, putative (GAPM1) Essential
PBANKA_103540 glideosome associated protein with multiple membrane spans 3, putative (GAPM3) Essential
PBANKA_111530 glideosome-associated protein 40, putative (GAP40) Essential
PBANKA_091030 guanylyl cyclase, putative (GCalpha) Essential
PBANKA_040180 HAD superfamily protein, putative Essential
PBANKA_124060 membrane skeletal protein, putative Essential
PBANKA_083100 merozoite surface protein 1 (MSP1) Essential
PBANKA_135570 myosin A (MyoA) Essential
PBANKA_145950 light chain 1, putative, myosin A tail domain interacting protein MTIP, putative (MTIP) Essential
PBANKA_082870 phosphatidylserine decarboxylase, putative Essential
PBANKA_020460 photosensitized INA-labeled protein 1, putative Essential
PBANKA_102010 PP1-like protein serine/threonine phosphatase, putative Essential
PBANKA_131540 protein phosphatase 2b regulatory subunit, putative Essential
PBANKA_122740 protein phosphatase, putative Essential
PBANKA_041600 RhopH3, putative Essential
PBANKA_080450 rhoptry associated membrane antigen, putative (RAMA) Essential
PBANKA_093200 rhoptry neck protein 4, putative (RON4) Essential
PBANKA_071310 rhoptry neck protein 5, putative (RON5) Essential
PBANKA_030950 secreted protein with altered thrombospondin repeat domain, putative (SPATR) Essential
PBANKA_080220 serine/threonine protein kinase, putative Essential
PBANKA_090380 serine/threonine protein kinase, putative Essential
PBANKA_031140 serine/threonine protein kinase, putative Essential
PBANKA_010290 SPE2-interacting protein, putative (SIP2) Essential
PBANKA_091170 subtilisin-like protease 2 (SUB2) Essential
PBANKA_143360 thrombospondin-related apical membrane protein (TRAMP) Essential
PBANKA_052170 transcription factor with AP2 domain, putative (ApiAP2) Essential
PBANKA_133270 Duffy-binding protein* Essential
PBANKA_135740 conserved Plasmodium protein, u. f. Essential
PBANKA_092090 conserved Plasmodium protein, u. f. Essential
PBANKA_092560 conserved Plasmodium protein, u. f. Essential
PBANKA_130430 conserved Plasmodium protein, u. f. Essential
PBANKA_103970 conserved Plasmodium protein, u. f. Essential
PBANKA_134490 conserved Plasmodium protein, u. f. Essential
PBANKA_144290 conserved Plasmodium protein, u. f. Essential
PBANKA_030450 conserved Plasmodium protein, u. f. Essential
PBANKA_133460 diacylglycerol kinase, putative Essential
PBANKA_092510 leucine-rich repeat protein (LRR11) Essential
PBANKA_082490 patatin-like phospholipase, putative Essential
PBANKA_083300 profilin, putative (PFN) Essential
PBANKA_061970 rhoptry-associated leucine zipper-like protein 1, putative (RALP1) Essential
PBANKA_103210 rhoptry-associated protein 1, putative (RAP1) Essential
PBANKA_030670 transporter, putative Essential
PBANKA_093690 actin-like protein, putative (ALP1) Slow 0.68
PBANKA_091150 conserved Plasmodium membrane protein, u. f. Slow 0.82
PBANKA_136440 conserved Plasmodium protein, u. f. Slow 0.89
PBANKA_092170 conserved Plasmodium protein, u. f. Slow 0.78
PBANKA_113880 conserved Plasmodium protein, u. f. Slow 0.66
PBANKA_080130 conserved Plasmodium protein, u. f. Slow 0.80
PBANKA_120200 membrane skeletal protein, putative Slow 0.72
PBANKA_051140 peroxiredoxin, putative (nPrx) Slow 0.73
PBANKA_100240 protease, putative Slow 0.83
PBANKA_083290 Protein MAM3, putative Slow 0.67
PBANKA_110650 rhomboid protease, putative (ROM4) Slow 0.72
PBANKA_090590 transcription factor with AP2 domain(s) (AP2-O) Slow 0.68
PBANKA_110140 RAP 2/3 Slow 0.76
PBANKA_110760 6-cysteine protein (P38) Redundant 1.07
PBANKA_100550 AAA family ATPase, putative Redundant 0.98
PBANKA_100360 apical sushi protein, putative (ASP) Redundant 0.88
PBANKA_130340 CCAAT-binding transcription factor, putative Redundant 0.99
PBANKA_090260 CCAAT-box DNA binding protein subunit B, putative (NFYB) Redundant 0.97
PBANKA_144720 CG2-related protein, putative Redundant 0.98
PBANKA_031480 conserved Plasmodium protein, u. f. Redundant 1.05
PBANKA_146060 conserved Plasmodium protein, u. f. Redundant 1.07
PBANKA_110690 conserved Plasmodium protein, u. f. Redundant 1.09
PBANKA_122540 conserved Plasmodium protein, u. f. Redundant 1.05
PBANKA_145910 conserved Plasmodium protein, u. f. Redundant 1.03
PBANKA_133660 conserved Plasmodium protein, u. f. Redundant 1.05
PBANKA_112010 conserved Plasmodium protein, u. f. Redundant 0.96
PBANKA_100250 conserved Plasmodium protein, u. f. Redundant 1.05
PBANKA_131420 conserved Plasmodium protein, u. f. Redundant 0.96
PBANKA_090610 conserved Plasmodium protein, u. f. Redundant 0.88
PBANKA_050720 conserved Plasmodium protein, u. f. Redundant 1.04
PBANKA_050690 conserved Plasmodium protein, u. f. Redundant 1.04
PBANKA_121060 conserved Plasmodium protein, u. f. Redundant 1.00
PBANKA_071660 conserved Plasmodium protein, u. f. Redundant 0.77
PBANKA_082160 conserved Plasmodium protein, u. f. Redundant 0.94
PBANKA_021450 conserved Plasmodium protein, u. f. Redundant 1.02
PBANKA_121440 conserved Plasmodium protein, u. f. Redundant 1.01
PBANKA_070200 conserved Plasmodium protein, u. f. Redundant 1.07
PBANKA_121180 conserved Plasmodium protein, u. f. Redundant 1.04
PBANKA_100530 conserved Plasmodium protein, u. f. Redundant 0.91
PBANKA_041780 conserved Plasmodium protein, u. f. Redundant 1.07
PBANKA_090840 conserved Plasmodium protein, u. f. Redundant 0.94
PBANKA_091240 conserved Plasmodium protein, u. f. Redundant 1.07
PBANKA_134640 conserved Plasmodium protein, u. f. Redundant 0.88
PBANKA_102760 conserved Plasmodium protein, u. f. Redundant 0.98
PBANKA_143910 conserved Plasmodium protein, u. f. Redundant 0.92
PBANKA_080250 conserved Plasmodium protein, u. f. Redundant 1.04
PBANKA_090860 conserved Plasmodium protein, u. f. Redundant 1.02
PBANKA_140920 conserved Plasmodium protein, u. f. Redundant 1.12
PBANKA_120180 conserved Plasmodium protein, u. f. Redundant 0.87
PBANKA_131280 conserved Plasmodium protein, u. f. Redundant 0.88
PBANKA_103170 conserved Plasmodium protein, u. f. Redundant 0.88
PBANKA_145870 conserved Plasmodium protein, u. f. Redundant 0.91
PBANKA_093210 DHHC-type zinc finger protein, putative Redundant 0.97
PBANKA_133580 disulfide-isomerase, putative Redundant 1.07
PBANKA_101910 DNA-directed DNA polymerase, putative Redundant 1.08
PBANKA_092970 haloacid dehalogenase-like hydrolase, putative Redundant 1.05
PBANKA_123730 inner membrane complex protein, putative Redundant 0.97
PBANKA_062240 kinesin, putative Redundant 0.99
PBANKA_091510 LEM3/CDC50 family protein, putative Redundant 1.05
PBANKA_051520 MORN repeat-containing protein 1, putative (MORN1) Redundant 1.07
PBANKA_110330 myosin pfm-b, putative Redundant 1.00
PBANKA_051535 OTU-like cysteine protease, putative Redundant 1.06
PBANKA_112810 phospholipase (PL) Redundant 1.02
PBANKA_142310 phospholipase, putative Redundant 0.96
PBANKA_072070 regulator of chromosome condensation, putative Redundant 1.01
PBANKA_011160 rhoptry protein, putative (ROP14) Redundant 1.05
PBANKA_093190 serine esterase, putative Redundant 1.01
PBANKA_130520 serine/threonine protein kinase, putative Redundant 1.06
PBANKA_112390 sphingomyelin synthase 1, putative (SMS1) Redundant 1.07
PBANKA_081700 sugar transporter, putative Redundant 0.90
PBANKA_120800 tubulin-tyrosine ligase, putative Redundant 1.04
PBANKA_145700 ubiquitin-conjugating enzyme, putative Redundant 1.10
PBANKA_071270 zinc finger protein, putative Redundant 1.06
PBANKA_031660 RBP, putative Redundant 1.05
PBANKA_100010 RBP, putative Redundant 1.04
3.4.1 Comparison with previous reverse genetic data
Data for a limited number of P. berghei knockouts was already avail-
able on the RMGMdb database. This allowed me to quickly establish
whether phenotypes of putative invasion-related genes were as one
would predict from this literature. In many cases the results of my
experiment agreed with previously published data.
For the 17 genes (of 25 for which there were previously reported P.
berghei transfection attempts) shown in Table 5 we detected phenotypes
identical to those previously described in literature, either by
finding the gene refractory to deletion, or to have a growth rate indis-
tinguishable from that of a redundant gene. For two further genes we
agreed on targetability but disagreed on whether growth was attenu-
ated, giving an agreement rate on targetability of 76%.
Table 5: Genes for which barseq phenotype agrees with previous data
In each redundant case, the growth rate is statistically indistinguish-
able from redundant controls. In each refractory case barcodes are
not detected in at least 2 timepoints for two out of three mice.
*: Attempt by Ghosh S; de Koning-Ward TF failed (unpublished
data, via RMGMdb)
gene                      id               consensus   ref.
DHHC9                     PBANKA_093210    Redundant   [65]
P38                       PBANKA_110760    Redundant   [198]
Pb235 (footnote 5)        PBANKA_031660    Redundant   [64]
Phospholipase             PBANKA_112810    Redundant   [197]
GAMER                     PBANKA_122540    Redundant   [3]
Pp7                       PBANKA_1020100   Refractory  [87]
Sub2                      PBANKA_0911700   Refractory  [197]
Pka                       PBANKA_083560    Refractory  [189]
Guanylyl cyclase          PBANKA_0910300   Refractory  [91]
Ser/thr protein kinase    PBANKA_0311400   Refractory  [189]
RON4                      PBANKA_093200    Refractory  [76]
Profilin                  PBANKA_083300    Refractory  [113]
TRAMP                     PBANKA_143360    Refractory  [191, 64]
RAP1                      PBANKA_103210    Refractory  [76], *
MSP-1                     PBANKA_083100    Refractory  [39]
GAPM1, PSOP3              PBANKA_1338900   Refractory  [59]
Ser/thr protein kinase    PBANKA_0903800   Refractory  [189]
CNA                       PBANKA_1227400   Refractory  [87]
There were also several conflicts with previous literature. Some of
these are potentially easily explained, others less so.
The serine/threonine protein kinase PBANKA_130520 was previ-
ously described as essential in P. berghei. [189] However, there was
some evidence for integration (although a clone could not be gener-
ated) and the gene has been successfully knocked out in P. falciparum
3.4 barseq growth phenotypes 85
[181]. It was also detected as potentially redundant by a previous
barseq experiment [80]. I detected it as likely redundant by barcode
sequencing and hypothesised based on the sum of the evidence that
a vector-specific problem may have resulted in the failure to disrupt
reported in [189]. To confirm this result an individual transfection
was carried out with the PlasmoGEM vector targeting this gene and
I confirmed by qPCR that the knock-out was successful (Fig. 26).
The dipeptidyl aminopeptidase DPAP3 plays a crucial role in asex-
ual growth in P. falciparum, where it is implicated as important to al-
low egress. [53] However the Leiden Malaria Group report (on RMG-
Mdb) that PBANKA_100240 could be knocked out with normal growth
detected from the increase in levels of haemozoin in the blood. We de-
tect slow growth, at a rate of 82%. This is a modest growth reduction
but is highly significant (p 0.006). I hypothesise that either this re-
duction in growth was too slight to be detected in the previous study,
or potentially that it only appears in a competitive environment.
The ApiAP2 protein AP2-O (PBANKA_090590) is so named be-
cause its primary role is to regulate the ookinete stage of the parasite.
[210] It was therefore a surprise to detect slow growth (rate=0.68) of
this mutant. The canonical model of AP2-O is that it is transcribed in
gametocytes but repressed by DOZI until the ookinete stage, where it
activates a raft of genes important for the ookinete. So why a growth
phenotype in the blood stages? My first explanation was that per-
haps a neighbouring gene important in the blood stages had a UTR
which extended into AP2-O and this was disrupted in the knock-out.
I therefore looked at strand-specific RNA-Seq from the P. falciparum
orthologue (Fig. 25). I found that this was not the case but I did ob-
serve that the gene is significantly transcribed in the schizont stage.
We do know that the AP2-O protein is not visible in the blood stages
when tagged with GFP [210], but this does not rule out a low level of
protein being present, which could still play a significant regulatory
role. The fact that expression peaks in schizonts correlates (perhaps
coincidentally) with the fact that both the merozoite and the ookinete
are invasive.
[Figure 25 image: strand-specific RNA-Seq coverage tracks, forward (F)
and reverse (R) strands, for schizont and stage V gametocyte samples
across the AP2-O locus; scale bar 1 kb.]
Figure 25: The AP2-O locus in P. falciparum shows schizont expression
and is not overlapped by any other transcript.
Strand-specific RNA-Seq data from [128], adapted from PlasmoDB. The
orientation has been reversed from genome orientation so that AP2-O
is facing forwards. ’F’ indicates forward with respect to AP2-O, ’R’
reverse. No genes have UTRs which overlap AP2-O.
PBANKA_113780, a protein containing Armadillo domains relatively
uncharacterised in Plasmodium, was present in the input pool at 2%,
but below the detection threshold in all mice by day 6. This would
lead us to believe it to be likely essential. However it is reported
to have been knocked out previously in P. berghei on the basis of
pulsed-field gel electrophoresis analysis. [105] This protein is the
orthologue of the TgGAC recently reported as essential in Toxoplasma,
which suggests it might also be essential in Plasmodium (the previous
positive PFGE result could be the result of a segmental duplication).
However more experiments will be needed to confirm which experiment’s
results reflect reality.
Previously rhoptry-associated protein 2/3 (RAP2/3, PBANKA_110140)
could not be knocked out over three attempts in P. berghei [196]. This
was in a sense a surprising result since RAP1 can be deleted in P.
falciparum [13] resulting in a failure to localise RAP2, without lethal
effects upon the parasite (and RAP3 is redundant in P. falciparum).
We observed apparent slow-growing RAP2/3 knock-out parasites in the
barcode-sequencing data. A possible explanation is the higher
efficiency that PlasmoGEM vectors allow compared to conventional
technologies. [154] I was able to generate parasites with a RAP2/3
knock-out by individual transfection (Fig. 26).
Intriguingly RAP1 appeared essential in our P. berghei screen, and
was recently described as such with a single knock-out approach
[105]. This phenotype, combined with the fact that the total loss of
RAP2/3 is viable, suggests that RAP1 plays a role in P. berghei that is
more important than simply trafficking RAP2/3 to the right cellular
location.
IMC1c (PBANKA_120200) could not be knocked out in a previous
P. berghei attempt, [195] although the researchers were able to tag it
with GFP, leading them to hypothesise its essentiality. Our data
suggest it to be attenuated, with a growth rate 71% of the wild-type.
The authors of the past study reported that parasites came up with
insertions of the resistance marker into non-specific genomic
locations. Perhaps the inconsistency can be explained by these rare
parasites outgrowing the slow-growing knock-out parasites. However, a
single attempt to knock out the gene with a PlasmoGEM vector did not
give rise to a significant knock-out population, so more investigation
is needed.
A final inconsistency was that we detected ROM4 (PBANKA_110650)
knock-out parasites as attenuated, but viable. This gene has been pre-
viously reported as essential in P. berghei. [124] In P. falciparum, ROM4
was shown to process EBA-175, shedding it into the supernatant as
invasion occurs. This step is thought to be crucial to invasion because
modifications to EBA-175 which remove the ROM4-processing site
prevent parasite growth, and the ROM4 gene itself has been previ-
ously reported as refractory to deletion in both P. falciparum and P.
berghei. [148] The barseq result was therefore surprising and so was
followed up with a single transfection of the pJAZZ vector into a sin-
gle mouse. When DNA from the parasites that came up from this
transfection was assessed by qPCR there was indeed a major reduc-
tion in the amount of locus present (Fig. 26). The parasites also came
up two days after those of the other vectors transfected in parallel, a
result consistent with attenuated growth.
If the processing that ROM4 performs is an essential process, one
possible explanation for the gene’s non-essentiality is shared
substrate specificity between ROM4 and the other rhomboid proteases
of the parasite.
3.4.1.1 Barseq data for invadome genes largely agrees with previous experimental results
A summary of all genes with data both from RMGMdb and barseq is
shown in Fig. 27. For 72% of genes the barseq result entirely agrees
with that seen previously, and only in 8% does barseq reverse a previ-
ous result. In the remaining cases barseq suggests attenuated growth
where previous techniques suggested redundancy or essentiality; three
such results have been confirmed in individual transfections.
[Figure 26 image: bar chart of relative abundance by qPCR (y-axis, 0 to
100) for four primer sets (Ser/thr kinase, ROM4, Control, Rap 2/3) in
each of three samples: Ser/thr kinase KO, ROM4 KO and Rap 2/3 KO.]
Figure 26: Individual knock-outs of three genes confirm barseq results which
contradict previous studies
These data demonstrate three confirmed knock-outs by individ-
ual transfection of a pJAZZ vector. qPCR was conducted on para-
site DNA from uncloned populations. Arrows in each case iden-
tify the bar of interest. (The small apparent reduction in Ser/Thr
kinase level in the ROM4 KO is an artefact of the system, and a
low parasitaemia in this sample, rather than a real mistargeting
effect.)
’Ser-thr kinase’ indicates PBANKA_130520
These results gave me confidence that barseq was proving a sensi-
tive method to detect growth phenotypes in invasion-related genes. I
therefore turned my attention to the majority of the dataset for which
there were no previous reverse genetic data in P. berghei.
3.4.2 Fitness results by organellar localisation
One way in which to break down the growth phenotypes we obtained
by barseq is to group them by their localisation within the parasite.
I combined ApiLoc and RMGMdb data to synthesise the localisation
data that existed for my phenotyped proteins.
3.4.2.1 Analysis of the rhoptry suggests newly essential components
The phenotypes I identified for mutants knocking out proteins
previously shown to localise to the rhoptries are shown in Fig. 28.
This shows the confirmed previous P. berghei refractory phenotypes for
RON4, RAP1 and TRAMP. RAP2/3 mutants appeared to produce attenuated
growth, in contrast to previous data (as discussed above).
[Figure 27 image: one icon per gene, left half showing prior RMGMdb
status and right half showing barseq status; legend: possibly
redundant, attenuated growth, possibly essential.]
Figure 27: Comparison of barseq phenotypes with previous RMGMdb phenotypes
This figure shows all genes analysed by barseq for which data
existed in RMGMdb prior to this experiment. Left hand side of
each gene represents status in RMGMdb, right hand side repre-
sents result by barseq.
Because of their known importance in invasion, the rhoptries have
been intensively studied in P. falciparum. I confirmed that previous
failures to knock out RhopH3, RAMA and RALP1 in this species ex-
tended to P. berghei, giving further confidence to the idea that these
are very likely essential.
I also found growth phenotypes for four genes which I believe have
not yet been characterised by reverse genetics in Plasmodium. RON5
and SPATR appeared refractory to deletion, suggesting that the latter
likely plays a role beyond that already established in the sporozoite.
[34] Unpublished data (referenced in [132]) suggests that in addition
to HepG2 cells, PfSPATR also binds erythrocytes and anopheline cells.
If these interactions are functional, they suggest a role for this protein
across the parasite life cycle. ROP14 and ASP appeared to be dispens-
able without any measurable effect on fitness.
There are fewer non-redundant genes among those localised to the
rhoptry than in the invadome as a whole, but this difference is not
statistically significant.
3.4.2.2 The inner membrane complex: fitness for knock-outs of motor-related proteins
My analysis included at least 12 inner membrane complex proteins,
very few of which had been previously examined in P. berghei (Fig. 29).
While previous analysis had occurred in Toxoplasma, it has been re-
cently discovered that major IMC components are entirely dispens-
able for invasion in this species, complicating the interpretation of its
phenotypes in a Plasmodium context. [61]
[Figure 28 image: one icon per rhoptry-localised gene (RhopH3, RAMA,
RALP1, RON4, TRAMP, RAP1, RON5, SPATR, ROP14, ASP, RAP2/3; an asterisk
also appears in the figure), left half showing prior RMGMdb status and
right half showing barseq status; legend: possibly redundant,
attenuated growth, possibly essential, previously unreported.]
Figure 28: Barseq phenotypes of rhoptry-localised genes
This figure presents the information about rhoptry genes pro-
vided by this study. The left-hand side of each icon represents any
known phenotype before this study, the right-hand side the phe-
notype called by barseq. Three genes (RhopH3, RAMA, RALP1)
which previously were known to be refractory to deletion in P.
falciparum are now known to be likely essential in P. berghei also.
Two genes previously found to be refractory to deletion (RON4,
RAP1) are confirmed as such. I newly suggest two genes (RON5,
SPATR) as likely essential and two (ROP14, ASP) as redundant.
Where previously attempts to delete RAP2/3 failed in P. berghei I
detect deletion as merely attenuating growth.
I have already discussed the contradictory phenotype I observed
for IMC1c as compared to previous studies, and listed the confirmed
phenotypes, possibly essential and redundant respectively, for GAPM1
and DHHC9.
It was not surprising to identify GAPM3, GAP40, MyoA and MTIP
as likely essential. These are core components of the motor machinery
which drives the parasite into the erythrocyte. However in the light
of the recent discovery in Toxoplasma that some of these genes are
not absolutely required for invasion, a conditional approach might
be needed to check whether these mutant parasites are severely
attenuated or entirely non-invasive. [61]
Three of the phenotyped proteins were alveolins (IMC1c, IMC1f,
IMC1g). Two of these were targetable (IMC1c, IMC1f), but resulted
in attenuated growth. Given the structural similarity of the alveolins
one might speculate that there is a degree of functional redundancy
within these proteins. However IMC1g appears to be essential. Some
sources claim that PBANKA_123730 is also an alveolin [125] but very
little is known about this protein. I identified it as redundant in the
blood stages.
[Figure 29 image: one icon per IMC-localised gene (GAPM3, GAP40, MyoA,
GAPM1, IMC1g, MTIP, IMC1c, IMC1f, ALP1, DHHC9, MORN1, PBANKA_123730),
left half showing prior RMGMdb status and right half showing barseq
status; legend: possibly redundant, attenuated growth, possibly
essential, previously unreported.]
Figure 29: Barseq phenotypes of IMC-localised genes
Five core glideosome components (GAPM1, GAPM3, GAP40,
MTIP, MyoA) were detected as likely essential, as was the alve-
olin IMC1g. IMC1c and IMC1f produced attenuation when singly
deleted, as did ALP1. MORN1, DHHC9 and PBANKA_123730
appeared redundant.
I identified MORN1 as redundant in the blood-stages. This is a po-
tentially surprising result given this protein is essential for correct
budding in Toxoplasma. [129] MORN1 is predicted by the Malaria
Adhesins and Adhesin-Like Protein Predictor (MAAP) [10] to be an
adhesin (or adhesin-like), with a score of 0.63. The predictor scores
the Toxoplasma orthologue significantly lower (0.04), perhaps hinting
that this protein
might have been recruited into a new role in Plasmodium. If so its
redundant phenotype would be consistent with that of most other
adhesins.
The measurement of the ALP1 knock-out as attenuated also contra-
dicted previous T. gondii data describing it as essential. [82]
While there are fewer non-redundant genes among the IMC-associated
genes than in the invadome as a whole, this is not a statistically sig-
nificant result.
3.4.2.3 Other proteins of prior interest
There are a number of additional proteins of interest for which we
obtained growth phenotypes, shown in Fig. 30.
MAM3 at first appeared a strange inclusion in the invadome: its name
comes from its Saccharomyces cerevisiae homologue, which is a
mitochondrial organiser protein. However this protein actually has
similarity to haemolysins.⁶ I therefore hypothesise that it may have
a role in egress, as has recently been suggested for haemolysin III.
[143] This would predict that the slow growth of the mutant is due to
a defect in egress.⁷
⁶ Information from S. cerevisiae genome description
[Figure 30 image: one icon per gene, left half showing prior RMGMdb
status and right half showing barseq status. Group labels: MyoB
complex, Merozoite surface, Proteases, IMC related, Phospholipases,
DNA associated, Misc. Gene labels as extracted: Phil1, AP2, SIP2,
Sub2, P38, MSP1, MyoB, Mlc-B, ROM4, AP2-O, Profilin, Phospholipase
(PL), Phospholipase, Peroxiredoxin (nPrx), MAM3, PBANKA_052170,
PBANKA_142310. Legend: possibly redundant, attenuated growth, possibly
essential, previously unreported.]
Figure 30: Barseq phenotypes of miscellaneous additional known genes
Miscellaneous proteins approximately grouped by shared features or
areas of biology.
It has very recently been found that a second myosin, MyoB, is
localised separately to MyoA, and, based on a failure to knock out its
interacting light chain, this gene was reported to be essential in the
blood stage. [211] Barseq confirmed the essential nature of the
relevant myosin light chain Mlc-B, but surprisingly identified MyoB as
redundant. However, given that this combination of phenotypes is
inconsistent with the model reported in [211], the results should be
treated with skepticism until a cloned knockout of MyoB can be
generated.
3.4.2.4 Barseq probes many previously uncharacterised proteins
The advantage of a large scale screening approach is that it assigns
growth phenotypes not only to the proteins listed above, about which a
certain amount is known already, but also to the bulk of proteins
about which we know very little.
These experiments have assigned growth phenotypes to 104 members of
the core invadome which were previously almost entirely
uncharacterised experimentally. This dataset will be useful in two
respects. Firstly, it is a source of data on individual genes: with
this dataset available, those investigating a candidate gene may
simply be able to refer to it rather than generating their own
knock-out. As further parts of the invasion machinery are identified
and confirmed by new methods, such as BioID [166], I believe that many
of their mutant phenotypes will already lie in my data, allowing the
prioritisation of more important targets. Secondly, I believe that
analysis of this phenotyping together with the protein-protein
interaction network, as I will discuss in a later chapter, has the
ability to reveal new insights into parasite biology.
⁷ One could speculate that the protein is not essential because its
role can be partially complemented by haemolysin III. It would be
interesting to see whether the double knock-out of haemolysin III and
MAM3 is viable.
[Figure 31 image: chart of phenotype counts: possibly essential 71,
redundant 60, attenuated 14.]
Figure 31: Proportion of various phenotypes in invadome barseq
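The counts in Fig. 31 translate directly into the proportions used in the following section. The sketch below is a minimal illustration, assuming that "targetable" means redundant plus attenuated and that the attenuation rate is taken among viable mutants only; the variable names are illustrative and not taken from any analysis code.

```python
# Phenotype counts as shown in Fig. 31.
counts = {"possibly_essential": 71, "redundant": 60, "attenuated": 14}

total = sum(counts.values())

# Assumption: a gene counts as targetable if a viable mutant was
# detected, i.e. it was called redundant or attenuated.
viable = counts["redundant"] + counts["attenuated"]

targetable_rate = viable / total
attenuated_among_viable = counts["attenuated"] / viable

print(f"genes screened: {total}")
print(f"targetable: {targetable_rate:.0%}")
print(f"attenuated among viable mutants: {attenuated_among_viable:.0%}")
```

Under these assumptions the two percentages agree with the targetability and attenuation rates quoted in the comparisons that follow.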
3.5 assessment of barseq phenotypes en-masse
As well as comparing phenotypes for individual mutants as above,
I also considered summary statistics for how many genes were tar-
getable, in comparison to previous screens. This might tell us both
something about the overall essentiality of genes that comprise the
invadome, and also about the types of data that barseq can supply.
3.5.0.1 Comparison to previous reverse genetic studies in P. berghei
I downloaded RMGMdb data for all 453 genes for which disruption
attempts had been reported in P. berghei. 59% of these attempts were
successful, as compared to 51% in my gene set.
Statistically this is a significant difference. While it might
represent increased essentiality in the invadome (which one would
predict given its enrichment for blood-stage genes), it could also be
the result of publication bias in favour of successful knock-outs,
i.e. redundant genes.
The rate of targetable genes in this putative invadome happens to be
identical to that in the previous barseq study of the kinome (51%),
[79] and very similar to that in a conventional screen of the
phosphatome. [87] It is possible that the targetability rate for the
genome as a whole lies in the vicinity of this figure. If so, the
implied proportion of essential genes (roughly half) would be much
higher than the 18.2% reported for S. cerevisiae [73] or the 16%
reported for E. coli [71]. It seems likely that when genes
preferentially expressed in non-blood stages are analysed the
proportion of essential genes will decrease, but it seems unlikely to
reach the level seen in these other systems, perhaps reflecting the
complexity and number of vital events in even just the
intraerythrocytic cycle of parasite development.
3.5.0.2 Comparison to previous forward genetic study in P. falciparum
Perhaps the largest screen published to date in Plasmodium is the for-
ward genetic screen using piggyBac insertional mutagenesis in P. falci-
parum (Balu et al., [15]). In a forward genetic screen there are no nega-
tive results (until saturation is reached). This precludes a comparison
of targetability, but we can compare the different growth phenotypes
seen when parasites are successfully modified.
Of those mutants we classified as viable, just 19% showed an
attenuated growth phenotype which we could detect. This is significantly
lower than that seen in the Balu et al. study in which around 50% of
clones analysed showed attenuation. I considered a number of possi-
ble reasons for this discrepancy.
Explanation 1: Barseq does not have the statistical power to detect
moderate attenuation
We can consider this possibility by ignoring the noise in our data
and assuming that our fitness measurements are accurate (because
though noise may change the results for individual genes, the overall
distribution should remain the same). The cut-off for attenuation vs.
redundancy in the Balu et al. analysis was ~80%. If we use such a
simple cut-off rather than a statistical test we obtain a similar
result: we find just 14% of our mutants to be attenuated. The t-test
is therefore not the cause of the inconsistency; we are not calling
as redundant mutants which Balu et al. would have found to be
attenuated.
Explanation 2: The forward-genetic screen includes insertions
outside genic regions
One could hypothesise these are more likely to produce an
intermediate attenuated phenotype, perhaps by simply changing
expression levels. Examining the Balu et al. data reveals that the 33
insertions within CDSes (coding DNA sequences) had a slightly lower
attenuation rate (45%), but this is still a significant difference
from the data in our approach.⁸
Explanation 3: Invasion genes have a more extreme fitness dis-
tribution than the genome as a whole:
This is entirely conceivable, but very difficult to test with-
out a larger reference dataset. We do know that the conven-
tional kinase and phosphatase screens presented a similar
phenotype distribution to the invadome.
Explanation 4: The distribution of phenotypes in P. berghei is
more extreme than in P. falciparum.
This is very possible, especially considering the differences
between a culture system and an in vivo model in which
the presence of an immune system may be less forgiving to
attenuated parasites, but the hypothesis is very difficult to
test without large comparable datasets for both organisms.
Explanation 5: Barseq is identifying severely attenuated mutants as
’essential’
A fifth of genes identified in the Balu et al. screen were identified
as ’severely attenuated’, meaning they led to growth rates below 60%
of those of wild-type parasites. We did not identify any with such a
low growth rate. Indeed our experimental design probably did not
allow us to. We required that to be considered viable, parasites had
to be present at > 0.1% in the population on day 7 in at least 2
mice. If vectors were to be integrated at roughly equal rates, any
given mutant might start off at 1.6% of the viable population. From
this starting point the parasite must grow at at least 63% of the
rate of a wild-type parasite if it is to remain >0.1% on day 7.
⁸ Hypergeometric distribution, p < 0.0004
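The 63% threshold in Explanation 5 can be reproduced with a short calculation. The sketch below assumes that a mutant's share of the population changes by a constant factor w per cycle relative to wild type, and that six cycles separate the starting point from day 7. Both are assumptions chosen because they reproduce the quoted figure; the text does not state the model used. The function name is hypothetical.

```python
def min_relative_growth(start_fraction, threshold_fraction, cycles):
    """Smallest per-cycle relative growth factor w for which
    start_fraction * w**cycles is still at the detection threshold."""
    if not 0 < threshold_fraction <= start_fraction <= 1:
        raise ValueError("expected 0 < threshold <= start <= 1")
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return (threshold_fraction / start_fraction) ** (1.0 / cycles)


# Values from the text: a mutant starts at 1.6% of the viable
# population and must remain above 0.1% on day 7.
# Assumption: six cycles of competition up to day 7.
w = min_relative_growth(start_fraction=0.016,
                        threshold_fraction=0.001,
                        cycles=6)
print(f"minimum relative growth rate: {w:.0%}")
```

With a different number of cycles, or a different starting share, the threshold shifts accordingly, which is one way to see why the boundary between 'attenuated' and 'essential' depends on the experimental design.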
I believe that a combination of explanations 4 and 5 is the likely cause
of the differences in distribution we see. A previous study attempted
to knock out 8 P. berghei genes which had reduced growth described
by Balu et al. in P. falciparum. [123] Half were entirely refractory to
deletion in P. berghei and the remainder gave normal growth (were
redundant); so differences in the experimental systems no doubt play
a role. However it is also important to acknowledge the insensitivity
of our system to very slow growing mutants, and we should accept
that we may be calling some of these as ’likely essential’.
The line in the sand between ’attenuated’ and ’essential’ is always
an artefact of the experimental system used. Any change that reduces
the overall parasite growth rate, for example by varying media or
varying a mouse’s diet, will shift the position of this line and so a
lower resolution here is not deeply concerning. We could detect more
attenuated mutants by removing the requirement that we detect bar-
codes until day 7, but this would be at the expense of increased num-
bers of false-positives. The optimum point to place this division will
depend on the aims of any particular experiment.
3.5.1 The picture today
As a result of the work described in this chapter we have trebled the
number of core invadome genes for which we can suggest a pheno-
type from 52 to 170, out of 320. A comprehensive picture of what
we now know about invadome targetability in P. berghei is shown in
Fig. 32. This can be compared to Fig. 20 in order to visualise the
progress described in this chapter. The genes for which phenotypes
have been newly described include 68 genes with the description
"conserved Plasmodium protein, unknown function". This first step to
assign 30 of these genes as likely essential prioritises them for further
study.
[Figure 32 image: one icon per core invadome gene, left half showing
prior RMGMdb status and right half showing barseq status; legend:
likely redundant, attenuated, likely essential, not attempted.]
Figure 32: All putative growth phenotypes for core invadome genes in P. berghei,
including both barseq and previous work
Each icon represents a single gene. The left side represents
previously reported RMGMdb targetability in P. berghei [105], the
right side represents barseq phenotyping data.
3.6 conclusions
The PlasmoGEM-barseq methodology has proven powerful when applied to
this set of putative invasion-related genes. For the small proportion
of these genes which already had described growth phenotypes, it has
fundamentally agreed with previous data. But in a small number of
cases previously described growth phenotypes have been challenged,
and three of these results have been confirmed by subsequent
independent transfections.
These data are a useful starting point for further studies of
invasion, and a resource for the community: one important piece of
information needed when looking at a new gene of interest is its
knock-out phenotype.
3.6.1 Possible follow-up work
We did not obtain any ‘severely-attenuated’ mutants with very slow
growth rates, in contrast to Balu et al. This might be in part due to
such mutants being outcompeted early in the infection, but might also
be a real result of an in vivo system. This result could be further
investigated by conducting a barseq transfection of only vectors
targeting ’likely essential’ genes and observing whether parasites
came up, and which barcodes they carried. Any newly identified
non-essential genes would have to be confirmed by independent
transfections to exclude the possibility of freak events which
incorporate a barcode without knocking out the target gene [79].
The PlasmoGEM project is forever moving forwards towards com-
plete coverage of the P. berghei genome. There are now a number of
additional invadome vectors available which will in time be analysed
by barseq to begin to fill in the parts of Fig. 32 which for now remain
grey.
Finally there are some limitations to using an in vivo system for
studying invasion by barcode sequencing. The development of an in
vitro approach might unlock new avenues for the application of Plas-
moGEM technology to invasion, as I will discuss in the next chapter.
4 THE TRANSFER OF PLASMOGEM TECHNOLOGY TO P. KNOWLESI
4.1 introduction
The previous chapter described the use of pooled-transfection bar-
code sequencing to assign growth phenotypes to mutants in a large
number of invasion related genes. The scale of these experiments
was made possible by the high efficiency of transfection possible in
P. berghei. However we also observed some of the limitations which
come with working in a rodent model.
The assay allowed me to measure the blood-stage fitness of para-
sites lacking each of these genes, and based on the PlasmoINT net-
work and their IDC placement we believe these genes to be involved
in erythrocyte invasion. But the barseq experiments did not provide
any direct evidence that the genes were involved in invasion rather
than, say, metabolism. Nor was I able to identify what stage, if any, of
that process they were involved in.
Many of P. berghei’s strengths as a model come from it being an in
vivo system, but there are also concomitant disadvantages. Because
of sequestration, P. berghei schizonts are not available without in vitro
schizont culture, and these schizonts do not rupture in vitro, render-
ing invasion relatively inaccessible. This clearly limits the number of
invasion-related experiments possible in P. berghei today.
In a model in which invasion was directly accessible, barseq might
be able to be applied directly to the invasion process. A pool of bar-
coded mutants could be used to produce schizonts, and the relative
ratio of different mutants assessed by barseq. The schizonts could be
allowed to rupture and the resulting merozoites purified and anal-
ysed by barseq. A decrease in barcode abundance as compared to the
schizonts would indicate an egress defect. One could progress further:
merozoites could be allowed to invade erythrocytes, with barcodes
amplified from the resulting rings. Differences in abundances would
reflect differences in the ability to invade.
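The stage-by-stage comparison proposed above could be analysed along the following lines. This is a minimal sketch with invented barcode counts and hypothetical names (`relative_abundance`, `stage_scores`, and a `control` barcode used for normalisation); it is not an existing PlasmoGEM analysis pipeline.

```python
import math


def relative_abundance(counts, pseudocount=0.5):
    """Convert raw barcode counts to fractions of the pool.

    A small pseudocount avoids division by zero for barcodes that
    drop out entirely.
    """
    adjusted = {barcode: n + pseudocount for barcode, n in counts.items()}
    total = sum(adjusted.values())
    return {barcode: n / total for barcode, n in adjusted.items()}


def stage_scores(before, after, control="control"):
    """log2 change in relative abundance between two stages.

    Scores are normalised so that the control barcode scores 0; a
    negative score means depletion relative to the control.
    """
    frac_before = relative_abundance(before)
    frac_after = relative_abundance({b: after.get(b, 0) for b in before})
    raw = {b: math.log2(frac_after[b] / frac_before[b]) for b in before}
    return {b: score - raw[control] for b, score in raw.items()}


# Invented barcode counts, for illustration only.
schizonts = {"control": 1000, "mutant_A": 1000, "mutant_B": 1000}
merozoites = {"control": 1000, "mutant_A": 250, "mutant_B": 1000}
rings = {"control": 1000, "mutant_A": 250, "mutant_B": 125}

egress = stage_scores(schizonts, merozoites)   # schizont -> merozoite
invasion = stage_scores(merozoites, rings)     # merozoite -> ring

for barcode in schizonts:
    print(barcode, round(egress[barcode], 2), round(invasion[barcode], 2))
```

In this invented example mutant_A is depleted between schizonts and merozoites (the pattern expected of an egress defect) while mutant_B is depleted only between merozoites and rings (the pattern expected of an invasion defect).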
Additionally, the barcoded invasion assay could take place into
erythrocytes treated with enzymes or antibodies to deplete certain
receptors, and thus link genes within the parasite to specific
erythrocyte components.
These approaches would probably be most powerful when applied to
attenuated mutants (it would not be possible to establish the
function of an essential gene in this way), and this brings us to
another limitation of the approach described in the previous chapter.
As we saw, attenuated mutants rapidly decrease in the population and
so the pooled transfection approach probably excludes analysis of a
sizeable proportion of severely attenuated mutants (which could be
~20% of genes). An in vitro system might mitigate this by allowing
individual transfections to be carried out for each gene without the
use of large numbers of animals, allowing pooling just before an
assay and so avoiding the problem of attenuated mutants being
outcompeted prior to analysis.
In short, there is much potential from establishing PlasmoGEM
technology in an in vitro system. PlasmoGEM uses linear transfec-
tion vectors, which have the advantage of minimising episomes, but
until the recent development of CRISPR/Cas9 approaches linear DNA
had not been used successfully in the primary in vitro system, P. fal-
ciparum. In P. knowlesi, however, linear transfection has been reported
for a long time [109] and the parasite has recently been adapted for
growth in vitro in human red blood cells with high transfection effi-
ciencies reported. This seemed an ideal system, therefore, to attempt
to bring an in vitro approach to PlasmoGEM technology.
4.2 system development
4.2.1 Optimising culture conditions
The P. knowlesi A1-H.1 clone (a descendant of the H strain) was a gift
from Robert Moon at the National Institute for Medical Research and
he provided invaluable advice in the initial stages of my project. My
early work was characterised by difficulties in parasite culture. Over
time I optimised a number of aspects, leading to robust growth in my
hands.
[Figure 33 image: bar chart of relative growth (y-axis, 0% to 100%) by
additional serum (x-axis): no additional serum; 0.5% additional
Albumax; 10% Addenbrookes serum; 10% horse serum; 10% Interstate
Blood Bank serum.]
Figure 33: The effect of different sera on P. knowlesi growth over 48 hours.
Parasitaemias were assessed with SYBR-green flow cytometry. Er-
ror bars reflect 95 % confidence-intervals. In each case the basal
media contains 0.5% Albumax.
I set up a number of assays to establish optimised conditions for
P. knowlesi culture. In each case a 0.5% parasitaemia culture of par-
asites was set up with triplicates for each condition. Parasites were
incubated for 48 hours, corresponding to approximately two intraery-
throcytic cycles.
After this time parasites were stained with SYBR green and the
final parasitaemia assayed with flow cytometry as described in Sec-
tion 2.2.8.
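The summary statistics behind such a growth assay can be sketched as follows. This is a minimal illustration rather than the analysis actually used: it assumes that relative growth is the mean final parasitaemia scaled to the best condition, and that the 95% confidence interval is a Student's t interval on triplicates. The function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 2 degrees of freedom
# (three replicates).
T_CRIT_DF2 = 4.303


def mean_ci95(triplicate):
    """Return (mean, half-width of the 95% CI) for three replicate values."""
    if len(triplicate) != 3:
        raise ValueError("expected exactly three replicates")
    half_width = T_CRIT_DF2 * stdev(triplicate) / sqrt(3)
    return mean(triplicate), half_width


def relative_growth(final_parasitaemias):
    """Scale each condition's mean (and CI half-width) to the best condition.

    Scaling to the best condition is an assumption made for this sketch.
    """
    summary = {name: mean_ci95(vals) for name, vals in final_parasitaemias.items()}
    best = max(m for m, _ in summary.values())
    return {name: (m / best, hw / best) for name, (m, hw) in summary.items()}


# Hypothetical final parasitaemias (%) after 48 hours, for illustration only.
example = {
    "condition A": [4.0, 4.2, 3.8],
    "condition B": [2.0, 2.1, 1.9],
}
result = relative_growth(example)
```

With only three replicates the t critical value is large, so intervals computed this way are wide even when the replicates agree closely.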
4.2.1.1 Horse serum permits stable growth of P. knowlesi
P. knowlesi is more demanding of serum than P. falciparum (where
10% serum or 0.5% Albumax are routinely used). The species has in
the past been grown in 20% Rhesus macaque serum, and was then
adapted to growth in 20% human serum. [109] During its adaptation
to human erythrocytes the line was grown in 0.5% Albumax II (w/v)
(which approximately corresponds to an initial 10% serum (v/v)),
combined with an additional 10% human serum.
102 the transfer of plasmogem technology to P. KN OWLESI
My attempts to cultivate parasites in serum from haemochromato-
sis patients, which is used for routine P. falciparum culture in our
lab, were not successful—parasites grew slowly and often crashed. I
ordered commercial serum from Interstate Blood Bank (footnote 1), which
was even less successful with parasitaemias crashing immediately when
parasites were transferred into it.
Ultimately I learnt that horse serum had been used quite success-
fully for P. knowlesi culture (R. Moon, personal communication). Al-
though I tried a number of strategies, including screening a num-
ber of different patient sera for growth, in my hands heat-inactivated
horse serum (Life Technologies) has been the most consistently
successful in culture. The results of a 48-hour growth assay confirming
this are shown in Fig. 33.
It is worth noting that this assay may understate the effect of serum,
since its effects may take days to become apparent. Equally however,
a caveat is that the parasites used in this experiment had been grown
in horse serum for a number of months before this assay and may
have adapted to it.
Others have achieved culture in human serum that is as good or
better than that in horse serum, but this typically involves screening
many batches whereas I have yet to find a batch of horse serum which
is not effective. That equine serum can replace human serum was also
observed for P. falciparum back in 1979. [24]
Given these results, subsequent work described here was carried
out with parasites grown in 0.5% Albumax and an additional 10%
horse serum.
4.2.1.2 Malaria gas is necessary for optimal P. knowlesi culture
‘Physioxia’, the oxygen level in human tissue, is typically 1%-11%.
Given this, it is remarkable that most tissue culture of human cells
is successful at oxygen levels of 19.95%. [28] However, Plasmodium
appears to be more sensitive to O₂ concentration, and the use of a
candle-jar was important for its initial adaptation to in vitro culture
[194]. Today malaria gas, a mixture of 93% N₂, 4% CO₂ and 3% O₂,
is more typically used. I sought to assess the importance of gassing
1 http://www.interstatebloodbank.com/
[Figure 34 plot: relative growth (0% to 100%) by atmosphere: air; CO₂;
malaria gas. Significance markers: ** and *.]
Figure 34: The effect of atmosphere on P. knowlesi growth over 48 hours.
Error bars reflect 95% confidence-intervals. In each case, media
contained 0.24% (w/v) sodium bicarbonate. (*: p < 0.05, **: p < 0.01)
cultures, and whether a CO₂ incubator could replace the need for this.
In a 48-hour growth experiment (Fig. 34), I found that (at least in
the absence of time to adapt) parasites grew significantly more slowly
in a CO₂ incubator than in sealed flasks gassed with malaria gas.
Nevertheless there was robust growth in the CO₂ incubator.
Both these conditions were markedly better than air, though this
will have been in part an effect of the sodium bicarbonate in the
medium, which is intended to buffer 5% CO₂.
In the light of these data I used a CO₂ incubator as a fall-back when
gas was not available and for very short periods of culture where
gassing was impractical. I have since maintained parasite culture for
more than a week in a CO₂ incubator when it was necessary to work
in a lab without malaria gas, showing that a low oxygen environment
is not essential.
4.2.1.3 Static culture appears superior to suspension culture for P. knowlesi
For P. falciparum, suspension culture achieved by gentle shaking is as-
sociated with faster parasite growth, lower multiplicity of infection
(MOI) and greater synchrony. [6] Especially significantly, this is re-
ported to markedly reduce the amount of time taken for modified
parasites to grow up after transfection. [6]
I hypothesised that shaking would also improve P. knowlesi culture.
But a 48-hour growth experiment (Fig. 35) actually demonstrated
[Figure 35 plot: relative growth (0% to 100%) by type of culture: shaken;
static. Significance marker: ***.]
Figure 35: The effect of shaking vs. static culture on P. knowlesi growth over 48
hours.
Error bars reflect 95% confidence-intervals. (***: p < 0.001)
static culture to be superior in terms of parasite growth with the
difference sufficient to overcome any improvement in MOI or syn-
chrony, even if this exists for P. knowlesi. This assay was repeated on
three separate occasions with results on each occasion showing static
culture to be superior.
We can speculate on the reason for such a result. The parasite was
adapted in static culture and so is likely to have optimised itself for
these conditions. However even P. falciparum lines which have long
been cultured in static conditions seem to benefit from shaking.
In static culture, the cells settle to form a layer of compacted
erythrocytes, both infected and uninfected. Shaking maintains blood and
medium in a churning suspension. Which of these is closer to the
in vivo case?
Qualitatively, the shaken medium appears much closer—blood is
itself a suspension of cells in serum. But the ratio of cells to fluid
in culture is very different in vitro and in vivo and this may have
important consequences.
Blood has a haematocrit of ~50%. In culture, without kidneys to
remove waste or a constant resupply of nutrients, we must lower
the haematocrit to ~2%. In suspension culture this means an infected
erythrocyte is likely on average to be 10.2 µm (footnote 2) from its
nearest neighbouring erythrocyte, as compared to 4.7 µm in real blood.
Perhaps this sparsity prevents as many merozoites contacting
erythrocytes in shaken cultures as compared to real blood.
2 $r = \left(\frac{3}{4\pi n}\right)^{1/3}$
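The footnote formula can be evaluated numerically as a rough check on the culture figure. The sketch below assumes that n is the number of erythrocytes per unit volume, estimated as haematocrit divided by a single-cell volume, and it assumes a cell volume of 90 µm³; neither assumption is stated in the text, and the in vivo figure quoted above may rest on different assumptions that this sketch does not try to reproduce.

```python
from math import pi


def characteristic_spacing_um(haematocrit, cell_volume_um3=90.0):
    """Evaluate r = (3 / (4*pi*n))**(1/3) in micrometres.

    n is cells per cubic micrometre, estimated as haematocrit / cell volume.
    The 90 um^3 default cell volume is an assumed typical value, not one
    taken from the text.
    """
    if not 0 < haematocrit <= 1:
        raise ValueError("haematocrit must be a fraction in (0, 1]")
    n = haematocrit / cell_volume_um3
    return (3.0 / (4.0 * pi * n)) ** (1.0 / 3.0)


# Culture conditions described in the text: ~2% haematocrit.
culture = characteristic_spacing_um(0.02)
```

Under these assumptions the 2% haematocrit case comes out close to the 10.2 µm quoted for suspension culture. Because r scales with the cube root of 1/n, even a 25-fold change in haematocrit changes the spacing less than threefold.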
But when erythrocytes settle in static culture they achieve a lo-
cal haematocrit of 100%, with each bursting schizont intimately sur-
rounded by erythrocytes which can be invaded. In this sense the static
culture is closer to the in vivo situation, and perhaps this explains P.
knowlesi’s preference for it.
Why the difference between P. falciparum and P. knowlesi then? Here
I enter the realm of speculation, but one possible explanation is their
different number of merozoites per schizont. If we model packed red
blood cells as packed spheres, each makes contact with 12 neighbours
around it in an optimal packing configuration.
That number, 12, is roughly equal to the number of merozoites in a
mature P. knowlesi schizont. One could imagine this being an optimal
figure if blood cells are packed ‘one merozoite per erythrocyte’. By
contrast P. falciparum schizonts can release up to 32 merozoites. That
could make multiple infections much more likely in a P. falciparum
static culture. The argument can also be made in reverse: perhaps the
larger number of merozoites makes up for the decreased likelihood
of each merozoite making contact with an erythrocyte in P. falciparum
shaken culture.
However, many other explanations are possible, perhaps
involving a greater sensitivity to mechanical stresses in P. knowlesi.
Regardless, in light of this data I did not shake parasites for routine
culture, nor when bringing up parasites from a transfection.
Despite this result shaken cultures were helpful over short time
periods. My tests showed greater invasion to occur from purified sch-
izonts over 90 minutes with shaking than without (data not shown).
One explanation is that once placed on an incubator shelf large static
cultures do not actually form a full settled erythrocyte layer for hours,
and schizonts are likely to be the last to settle given their lower den-
sity. In these situations, where a suspension culture is inevitable in the
first hour regardless, shaking is likely to accelerate the rate at which
merozoites encounter erythrocytes. Shaking also causes the culture to
reach thermal equilibrium with the incubator more quickly which is
important for late stage cultures.
Given these results I used shaking during the tight synchronisation
procedure, and immediately post-transfection. In these short-term sit-
uations haematocrits above 2% were used.
4.2.2 P. knowlesi permits a high efficiency of transfection
For a rapid, quantifiable detection of the efficiency with which DNA
was taken up by P. knowlesi I employed a luciferase assay.
The plasmid pHLH-1 [49], which expresses firefly luciferase from
a P. falciparum hrp3 promoter, was obtained from MR4. DNA (40 µg)
was transfected into tightly synchronised mature P. knowlesi schizonts
using the Amaxa Nucleofector system as described in [142]. Schizonts
were resuspended in Lonza Nucleofector P3 solution, pulsed with
programme FP-158 and added to fresh warm blood and media. Luciferase
activity was assessed 18 hours later, as described in Section 2.2.9.
The results (Fig. 36) demonstrated a substantially higher transfection
efficiency than P. falciparum and, as a side-effect, also show the P.
falciparum hrp3 promoter to be functional in P. knowlesi.
[Figure 36 plot: counts per 10 seconds (log scale, 1 to 100,000) by
culture: control; transfected P. knowlesi; transfected P. falciparum.]
Figure 36: Luciferase assays demonstrate the high transfection efficiency of P.
knowlesi.
The P. falciparum result shown here is a typical result obtained
by others in the lab (courtesy of Zenon Zenonos) rather than the
result of my own experiment. Control represents an extract from
untransfected parasites, i.e. background instrument detection. Er-
ror bars represent 95% confidence intervals in luciferase detection
and confidence intervals for transfection itself might be larger.
4.2.2.1 Luminescence does not increase linearly with DNA concentration
Plasmodium transfections typically involve quantities of DNA orders
of magnitude above the manufacturer’s directions for Lonza Nucleo-
fector kits. I sought to assess whether this was detrimental to transfec-
tion. To test this, I transfected the same luciferase construct into the
same batch of parasites, but at two different concentrations tenfold
apart. I found the larger amount of DNA to yield more luminescence,
but the magnitude of the increase was not proportional to the
increase in concentration. This suggests that saturation may be
approached at 0.8 µg but that it has not been reached.
[Figure 37 plot: relative luminescence (0 to 2000) against amount of DNA:
8 µg; 0.8 µg.]
Figure 37: Effect of DNA quantity on luminescence intensity in a luciferase assay.
Increasing the quantity of DNA transfected in a P. knowlesi trans-
fection from 0.8 µg to 8 µg
4.2.2.2 Transfectant parasites are visible one day after P. knowlesi transfec-
tion
To complement the data from luciferase assays I also conducted a sch-
izont transfection with the episomal GFP plasmid described in [142].
16 hours post transfection parasites expressing cytosolic GFP could
be seen in the culture (Fig. 38). I was not able to achieve transfected
parasitaemias as high as the ~0.5% seen in [142] but was able to ob-
tain fluorescent parasitaemias greater than 0.01%, much higher than
for P. falciparum.
[Figure 38 image panels: BF; GFP; Hoechst; Merge.]
Figure 38: One of many parasites with cytosolic GFP seen one day post-
transfection.
4.2.2.3 Nucleofector solution affects parasite viability
I tested two different nucleofector solutions: nucleofector P1 and nu-
cleofector P3. I found vast decreases in viability using nucleofector P1
with very few parasites surviving transfection even prior to selection.
4.2.3 Mechanics of schizont transfection
There is a lack of information in the literature about what physically
occurs during a schizont transfection, but it seems important that
schizonts, and the merozoites they contain are at a very late, mature
stage for them to survive the process.
One potential explanation for this low viability is that electropora-
tion might lyse one or both of the parasite vacuole membrane (PVM)
and the erythrocyte plasma membrane, and that any schizonts that
do not already contain merozoites sufficiently mature for invasion
simply die.
However, this presents something of a puzzle as during normal de-
velopment secretion of the exonemes both begins the egress process,
and is also important for processing merozoite surface proteins to
prime them for invasion.
In order to better understand what actually happens to schizonts in
the cuvette during a schizont transfection, I prepared smears from
cultures immediately before and immediately after transfection of
schizonts in nucleofector solution P3 using programme FP-158 in the
Lonza nucleofector.
(a) Schizont purification immediately
prior to transfection
(b) Culture immediately after transfec-
tion, with apparent lysis of the par-
asitophorous vacuole, and in some
cases the red blood cell.
Figure 39: Giemsa smears before and after electroporation with Lonza nu-
cleofector FP-158. Representative cells are enlarged.
Before transfection in typical P. knowlesi fashion the merozoites are
tightly clustered, radiating from the food vacuole. Afterwards most
red blood cell membranes are intact, although understanding the pro-
portions is hampered by the fact that centrifugation may have prefer-
entially pelleted intact cells. However in contrast to the tightly organ-
ised appearance before electroporation, merozoites are distributed
throughout the cell. The appearance suggests to me that electropo-
ration causes the PVM to break down.
There are two ways in which one might imagine this happening:
• The electric shock could directly cause disruption of the PVM.
• The pores created by the electric pulse might allow Ca²⁺ to
  enter the parasite cytosol from intracellular stores (such as the
  endoplasmic reticulum), inducing egress in much the same way
  as calcium ionophore. [77]
In my hands, post-transfection parasitaemias achieved are at best
10% of those expected if the same number of schizonts was added
to fresh blood without electroporation. Understanding the source of
this attrition is likely to be an important component of increasing
P. knowlesi transfection efficiency in future. If the PVM breaks down
before the egress cascade is properly initiated this might result in the
release of merozoites which are non-invasive.
I was not able to proceed further with this analysis but I think that a
full understanding of the mechanics of transfection may be helpful for
further increases in transfection efficiency; this might be achieved by
using a GFP marker to identify the PVM with analysis by microscopy
immediately post transfection.
If it were found that no ordered egress occurs after a schizont trans-
fection there might be value in using a transfection buffer which in-
duces lysis of the erythrocyte membrane and PVM in order to leave
fewer membranes between the exogenous DNA and the parasite nu-
cleus.
4.2.4 Characterisation of the response of P. knowlesi to selective agents
One pre-requisite for a genetic system is the availability of
drugs allowing those parasites which have taken up exogenous DNA
to be selected from the background population of wild-type
parasites. Previously WR99210, G418 and blasticidin have been used for
selection in P. knowlesi, in the context of in vitro culture in macaque
cells. [199] WR99210 was also used in selection in the A1.H1 clone
in human cells. [142] Because drug-sensitivity is often strain specific,
I decided to determine kill curves for three drugs in the A1.H1 clone:
pyrimethamine, G418 and DSM1. Pyrimethamine was selected because
of unpublished reports of a large difference in IC₅₀ between
wild-type and transformed parasites, G418 because it has the highest
increase in IC₅₀ published in P. knowlesi [199] and DSM1 because it
was already in use in a number of P. falciparum constructs in the lab
and its use would simplify the adaptation of these constructs to P.
knowlesi.
Dilution series for the drugs were set up in 96-well plates in
complete media, seeded with parasites and fresh erythrocytes and
incubated for 48 hours. Final parasitaemias were assessed by flow
cytometry. The results are shown in Fig. 40. I used a 4-parameter
logistic regression to fit a sigmoidal curve to the data, and
interpolated IC₅₀ values which are shown in Table 6.
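The shape of a four-parameter logistic fit can be sketched in a few lines. This is an illustration only: the fitting software used for the thesis is not stated, and the crude grid search below stands in for a proper nonlinear least-squares routine. The function names, the fixed top and bottom plateaux, and the synthetic dilution series are all assumptions made for the example.

```python
from math import log10


def four_pl(conc, bottom, top, ic50, hill):
    """Four-parameter logistic: response falls from top to bottom as conc rises."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def fit_ic50(concs, responses, steps=200):
    """Crude least-squares fit by grid search.

    top and bottom are fixed at the largest and smallest observed responses
    (a simplifying assumption). IC50 is searched on a log-spaced grid across
    the tested concentrations, the Hill slope on a grid from 0.25 to 5.
    """
    top, bottom = max(responses), min(responses)
    lo, hi = log10(min(concs)), log10(max(concs))
    best = None
    for i in range(steps + 1):
        ic50 = 10 ** (lo + (hi - lo) * i / steps)
        for j in range(1, 21):
            hill = 0.25 * j
            sse = sum((four_pl(c, bottom, top, ic50, hill) - r) ** 2
                      for c, r in zip(concs, responses))
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return {"bottom": bottom, "top": top, "ic50": best[1], "hill": best[2]}


# Synthetic, noise-free dilution series generated from a known curve
# (arbitrary concentration units); not data from this work.
concs = [10 ** (k / 2) for k in range(-6, 7)]
true = dict(bottom=0.1, top=5.0, ic50=3.0, hill=1.5)
responses = [four_pl(c, **true) for c in concs]
fit = fit_ic50(concs, responses)
```

Searching the IC₅₀ on a logarithmic grid mirrors the way dilution series are laid out, so the resolution of the estimate is roughly constant in fold terms across the tested range.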
Drug            Target                         IC₅₀      95% CI
Pyrimethamine   Dihydrofolate reductase        3 nM      2.8 to 4.2
G418            Protein synthesis              377 µM    248 to 573
DSM1            Dihydroorotate dehydrogenase   1.5 µM    0.57 to 4.24
Table 6: Calculated IC₅₀s in P. knowlesi for pyrimethamine, G418 and DSM1
drugs
IC₅₀s and their confidence intervals, as calculated from the growth
inhibition curves shown in Fig. 40.
4.2.4.1 Pyrimethamine
I calculated an IC₅₀ of 3 nM for pyrimethamine, considerably lower
than that in P. falciparum (~40 nM). [161] The IC₅₀ for the activity
of the PfDHFR enzyme is reported at 42 nM, [145] whereas that for
PkDHFR was reported as 1 nM, [25] a ratio consistent with my
results. I subsequently identified early P. knowlesi literature from 1971
suggesting an IC₅₀ close to 1 nM. [86] This increased sensitivity has
important implications for the selective pressure to be used.
4.2.4.2 G418
The IC₅₀ I calculated for G418 (geneticin) was 377 µM, within the
range seen for sensitive P. falciparum isolates (3D7 is one of the most
sensitive with an IC₅₀ of 300 µM). [135] An IC₅₀ of 600 mM (three
orders of magnitude higher) was reported in P. knowlesi, [203] but
from the other figures described in this publication I believe this was
a copy-editing error and that the authors intended 600 µM. In sum,
it appears that use of the neoR marker might be effective in the P.
knowlesi A1.H1 clone.
4.2.4.3 DSM1
The most surprising result was for DSM1. This drug has an IC₅₀ of
56 nM in P. falciparum. I found P. knowlesi to be 27× less sensitive to
it. Given the large difference seen, and the relatively large confidence
interval, I cultured parasites at 500 nM and 10 µM and after 72 hours
observed complete inhibition in the case of 10 µM and normal growth
for 500 nM. I then checked the drug was still functional by observing
that it killed P. falciparum at 500 nM. This degree of insensitivity (in
P. falciparum the selective marker yDHODH provides an IC₅₀ of only
4.6 µM) may rule out the use of this drug as a selectable marker in P.
knowlesi.
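The arithmetic behind the fold difference, and behind the concern about the selection window, can be made explicit. The concentrations below are those quoted in the text. The second calculation assumes, purely for illustration, that the yDHODH marker would confer the same IC₅₀ in P. knowlesi as in P. falciparum, which has not been tested here.

```python
# Concentrations in molar, as quoted in the text.
IC50_DSM1_PK_WT = 1.5e-6      # wild-type P. knowlesi (1.5 uM)
IC50_DSM1_PF_WT = 56e-9       # wild-type P. falciparum (56 nM)
IC50_DSM1_PF_YDHODH = 4.6e-6  # P. falciparum carrying yDHODH (4.6 uM)


def fold_change(ic50_a, ic50_b):
    """How many times larger ic50_a is than ic50_b (same units)."""
    if ic50_a <= 0 or ic50_b <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_a / ic50_b


# Loss of sensitivity in P. knowlesi relative to P. falciparum.
dsm1_fold = fold_change(IC50_DSM1_PK_WT, IC50_DSM1_PF_WT)

# Hypothetical selection window if the marker behaved as in P. falciparum
# (an assumption, not a measured value).
window = fold_change(IC50_DSM1_PF_YDHODH, IC50_DSM1_PK_WT)
```

On these figures the wild-type IC₅₀ sits only about threefold below the resistance level the marker provides in P. falciparum, which is the quantitative basis for doubting its usefulness as a selectable marker.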
I performed a Clustal alignment of the P. falciparum 3D7 and P.
knowlesi H strain DHODH sequences, and found that, while the P.
knowlesi gene is well conserved with P. falciparum, there are a number
of differences, including a C175L substitution. This Cys175 is within
4 Å of the site of DSM1 binding, [50] and so might well play a role in
the >20-fold reduced sensitivity to DSM1.
These experiments are, to my knowledge, the first use of DSM1 in
P. knowlesi and though they reveal it to be effective at killing wild-type
parasites at high concentrations, they suggest a decreased sensitivity
as compared to P. falciparum which may limit the drug’s usefulness
for genetic studies.
4.2.5 Genomic analysis
In the course of this project I performed whole-genome Illumina
sequencing of various parasite populations derived from
the A1.H1 clone. I will discuss the rationale, and results, of this
sequencing in a locus-specific manner later. But I will here provide
some overall observations on the genome of my isolate of the A1.H1
clone (which has undergone more than 6 months of culture in my
hands, and so may not reflect that published in [142]).
I am comparing to the reference H-strain, [151] which since it was
isolated had been maintained exclusively in a cynomologous context.
The A1.H1 line does not produce gametocytes. In lab-adapted P.
berghei and P. falciparum strains which lose the ability to produce
gametocytes, this is often the result of a mutation in AP2-G [179]. I
therefore looked for such a mutation but none was apparent. However
a scan of other ApiAP2 genes did reveal a SNP which changed
a Cys codon to a premature stop codon in PKNH_1441500. This would be
expected to prevent the translation of the final 3.5 kb of this 8.6 kb
gene. Since ApiAP2 genes appear to be some of the master regulators
of Plasmodium transcription this may well be a functional change
but whether it affects gametocytogenesis remains to be proven.
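A short enumeration shows which single-nucleotide changes can turn a cysteine codon into a stop codon, and how much of the gene would remain translated. The sketch assumes the standard genetic code and treats the quoted gene length as if it were uninterrupted coding sequence; the actual substitution observed in PKNH_1441500 is not specified in the text, so none is assumed here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
CYS_CODONS = {"TGT", "TGC"}


def single_base_neighbours(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for pos in range(3):
        for base in "ACGT":
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


# All (Cys codon, stop codon) pairs reachable by one substitution.
cys_to_stop = sorted(
    (cys, mutant)
    for cys in CYS_CODONS
    for mutant in single_base_neighbours(cys)
    if mutant in STOP_CODONS
)

# Fraction of the gene still translated, using the lengths quoted in the text.
gene_kb, lost_kb = 8.6, 3.5
fraction_retained = (gene_kb - lost_kb) / gene_kb
```

Under the standard code both cysteine codons reach a stop only through a third-position change to TGA, and roughly 59% of the gene length lies upstream of the truncation.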
[Figure 40(a) plot: final parasitaemia (%) against concentration (µM, log
scale) for pyrimethamine and G418.]
(a) Growth inhibition curve for wild-type P. knowlesi parasites
placed under increasing concentrations of pyrimethamine and
G418
[Figure 40(b) plot: final parasitaemia (%) against DSM1 concentration (µM,
log scale), with a control.]
(b) Growth inhibition curve for wild-type P. knowlesi parasites
placed under increasing concentrations of DSM1
Figure 40: P. knowlesi is sensitive to DSM1, pyrimethamine and G418, but with
IC₅₀ values spanning four orders of magnitude.
Parasitaemias were calculated by SYBR-green flow-cytometry
from parasites grown in serial dilutions of each drug.
[Figure 41 image: structure with labels Cys175 and DSM1.]
Figure 41: Crystal structure of P. falciparum DHODH bound by DSM1, highlight-
ing a residue that differs in P. knowlesi
I downloaded the crystal structure from PDB ID 3I65 [50], and
visualised it using Cn3D [202]. I have highlighted in yellow Cys175,
which is in close contact with DSM1 (blue). In P. knowlesi this
residue is substituted by a hydrophobic leucine which may play
a role in increasing the IC₅₀ for DSM1 in this species.
In addition, I noted the presence of a 450 kb segmental duplication
in chromosome 14 (Fig. 43). This contains 99 genes, including the
invasion-related gene TRAMP. Recent reports in another
human-adapted P. knowlesi line have identified duplications of invasion
ligands as a potential mechanism of adaptation, [121] meaning that the
99 genes in this region might well be of some interest.
It is important to bear this duplication in mind when designing
gene targeting experiments since replacement of any gene within it
with a selectable marker will still leave one copy of the wild-type
gene.
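A duplication of this kind shows up as a run of genomic windows with roughly double the typical read depth. The sketch below illustrates one simple way of flagging such runs; it is not the method used to produce Fig. 43. The function name, the 1.5× threshold and the synthetic depths are all assumptions made for the example.

```python
from statistics import median


def duplicated_windows(window_depths, threshold=1.5):
    """Return (start, end) index runs whose depth is >= threshold x the median.

    `window_depths` holds mean read depths for consecutive, equally sized
    windows along one chromosome. `end` is exclusive.
    """
    baseline = median(window_depths)
    runs, start = [], None
    for i, depth in enumerate(window_depths):
        elevated = depth >= threshold * baseline
        if elevated and start is None:
            start = i                      # a new elevated run begins
        elif not elevated and start is not None:
            runs.append((start, i))        # the current run has ended
            start = None
    if start is not None:                  # run extends to the chromosome end
        runs.append((start, len(window_depths)))
    return runs


# Synthetic example: 40 windows at 50x depth, with windows 10 to 18 at 100x.
depths = [50.0] * 40
for i in range(10, 19):
    depths[i] = 100.0
runs = duplicated_windows(depths)
```

A gene falling inside a flagged run would be expected to retain a wild-type copy after a single replacement event, which is the practical point made above for designing targeting experiments.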
4.3 preparation of plasmogem vectors
4.3.1 Cloning of Gateway cassette
The protocol for generating PlasmoGEM targeting constructs involves
replacing the region to be targeted with a Zeo/PheS negative selec-
tion cassette using recombineering and then a final Gateway reaction
to change this bacterial selection cassette for one which is functional
in Plasmodium. (Fig. 13) My first action in adapting the system was to
construct a Gateway cassette that would function in P. knowlesi.
Figure 42: Premature stop codon in an ApiAP2 gene in A1.H1 isolates
I amplified the region upstream of the P. knowlesi hsp70 gene, adding
overhangs to insert it after the Pb 3' UTR of the Pb3HA-dhfr-yfcu
vector and before the hDHFR/yfcu fusion gene. The hDHFR/yfcu gene
was itself amplified, as was the 3' UTR of PkHsp70. These three
fragments were assembled in a Gibson Assembly reaction and cloned into
the vector cut at the PstI and KpnI sites. The assembled product was
transformed into PIR2 cells which are permissive for the R6Kγ origin
of replication.
4.3.2 Assessing the pilot pJAZZ library
As I began this project a small library had been produced containing
sections of P. knowlesi genomic DNA in the pJAZZ backbone [78].
This was carried out by Burcu Anar, and clones were mapped to the
P. knowlesi genome by Frank Schwach.
I examined the location of clones on the genome and found that
they covered 876 genes. When I added a requirement that the clone
contain at least 1000 bp upstream of the gene, to act as a homology
arm, this was reduced to 687 genes, just over 12% of the genome. I
performed a simulation to examine how coverage would be increased
by library expansion, shown in Fig. 44. This shows that should exper-
imental efforts be successful with the limited P. knowlesi library, it
should be possible to expand it to be able to target the majority of the
genome with ~10,000 mapped clones but that 100% coverage will
never be cost-effective.
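The logic of such a coverage simulation can be sketched on a toy genome. This is not the simulation used for Fig. 44: the gene layout, genome length, insert-size parameters and function name below are hypothetical, and genes are treated as forward-strand only for simplicity.

```python
import random


def simulate_coverage(n_clones, genes, genome_len, mean_insert=5000,
                      sd_insert=1000, flank=1000, seed=0):
    """Fraction of genes contained in at least one simulated clone.

    Returns (plain, flanked). `plain` counts genes lying wholly inside a
    clone; `flanked` additionally requires at least `flank` bp of insert
    upstream of the gene start to act as a homology arm.
    """
    rng = random.Random(seed)
    clones = []
    for _ in range(n_clones):
        # Insert sizes drawn from a normal distribution, as in the text.
        size = max(1, int(rng.gauss(mean_insert, sd_insert)))
        start = rng.randrange(0, genome_len - size)
        clones.append((start, start + size))
    plain = flanked = 0
    for g_start, g_end in genes:
        containing = [(s, e) for s, e in clones if s <= g_start and g_end <= e]
        plain += bool(containing)
        flanked += any(g_start - s >= flank for s, e in containing)
    return plain / len(genes), flanked / len(genes)


# Toy genome: 200 genes of 1.5 kb, one every 5 kb, on a 1 Mb sequence.
genome_len = 1_000_000
genes = [(i * 5000 + 1000, i * 5000 + 2500) for i in range(200)]
curve = [simulate_coverage(n, genes, genome_len) for n in (50, 200, 800)]
```

Because each new clone is increasingly likely to land on a gene that is already covered, the gain per clone falls as the library grows, which is the diminishing-returns behaviour described above.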
[Figure 43 plot: sequencing coverage (1x, 2x) against genomic coordinates.]
Figure 43: Segmental duplication in chromosome 14
This segmental duplication is present in all of my isolates of the
A1.H1 clone, with some implications for gene targeting in this
region.
Limited to this pilot library, I identified those genes covered within
it which were part of the core invadome, as defined in Chapter 3. The
PlasmoGEM team attempted to manufacture 24 knock-out constructs
and 24 tagging constructs for these invasion genes. There is a degree
of attrition in the production process and this first run had particular
problems with carryover of wild-type clones. Therefore only 14 clones
passed the final QC process. Additionally these clones were produced
using an earlier version of the GW cassette designed to use G418 as a
resistance marker (footnote 3).
I took intermediate vectors made by the recombineering team which
had given rise to successfully screened vectors, and used a Gateway
reaction to insert the newly constructed P. knowlesi dhfr/yfcu Gate-
way cassette. I screened the reactions and identified 9 correct target-
ing vectors, shown in Table 7.
4.4 plasmodium knowlesi gene targeting
All of these constructs were used in pilot P. knowlesi gene targeting
experiments. Whether a construct was for tagging or knock-out was
primarily the result of attrition during the production process rather
than a planned selection.
3 I cloned this cassette and attempted transfections which were not successful, but the
cause of these failures could not be concluded and so those experiments will not be
described in detail
[Figure 44(a) plot: simulated distribution against insert size / bp (axis
2500 to 7500).]
(a) Hypothetical size distribution for fragments ‘sheared to 5 kb’,
assuming a normal distribution and the ranges seen in the
manufacturer’s shearing protocol. The shearing may not in fact be
normal, with positive skew likely, but the decreased transformation
efficiency of large inserts is likely to counteract this, and so I
approximate to a normal distribution.
[Figure 44(b) plot: predicted coverage (0% to 100%) against mapped clones
(5000 to 20000); series: basepair covered; covered with 1 kb flanking
region.]
(b) As the number of simulated clones increases, coverage increases but
the rate of increase steadily drops (due to multiple coverage). The red
line shows the amount of the genome expected to be covered, whilst the
blue line predicts the proportion for which a target cassette will be
able to be produced.
Figure 44: Possibilities for library expansion
4.4.1 Genes targeted for HA-tagging
Three proteins were targeted for HA-tagging:
• Apical sushi protein was initially described as localising to
  the micronemes but is now known to reside in the rhoptry-neck.
  [183] Its function is not known.
• ROP14 is a rhoptry-localised protein. Its exact function is not
  known but it appears to be one of the last proteins to leave from
  the rhoptries, its secretion probably taking place after invasion.
  [214]
• Ribonuclease H2 subunit A expression peaks at 46 hours
  post-invasion in P. falciparum. Although it forms part of the Hu
  et al. invadome it seems likely to be a false positive, since in
  other systems it is suggested that it may be involved in DNA
  replication.
4.4.2 Genes targeted for knock-out
Six constructs were successfully produced for knocking out P. knowlesi
genes. I will briefly review the proposed functions of the genes these
Gene ID         Description                    Construct
HA tagging constructs
PKNH_0304000    Apical sushi protein           TGL58
PKNH_1126300    Ribonuclease H2 subunit A      TGL60
PKNH_1136700    ROP14                          TGL61
Knock-out constructs
PKNH_0406200    Ser/Thr protein kinase         TGL59
PKNH_0918700    DHHC3                          TGL62
PKNH_0941100    DnaJ protein                   TGL63
PKNH_1207600    Cons. Plasm. protein, u.f.     TGL64
PKNH_1306400    Rhodanese like protein, put.   TGL65
PKNH_0423200    Protein kinase G               TGL66
Table 7: Targeting constructs available for initial P. knowlesi PlasmoGEM experi-
ments
constructs are intended to knock out, and any expected mutant phe-
notypes.
4.4.2.1 Target 1: A type IV J protein
PKNH_0941100 encodes a DnaJ protein which is conserved in
Plasmodium but lacks orthologues in other Apicomplexa despite the
presence of orthologues in metazoans, as shown in Fig. 45. Transcription
of its P. falciparum orthologue, PF3D7_1143200 (footnote 4), peaks at
33 hours post invasion and despite initial suggestions of an apicoplast
localisation [21], proteomic data suggested it to be present in the
merozoite. [89]
The Hsp40 family, of which this protein is a member, is expanded
in P. falciparum with 43 members and this expansion is hypothesised
to be linked to the parasitic lifestyle and the need to invade and re-
model erythrocytes. [21] It is divided into subfamilies I, II, III and
IV. PF3D7_1143200 falls into the type IV family. 11 of 12 members of
this subfamily have PEXEL export sequences. PF3D7_1143200 is the
solitary exception which does not.
PF3D7_1143200 was confirmed to be an integral membrane protein
which was expressed from 37 hours post invasion until egress and
was absent from rings. [89] Its localisation is initially reminiscent of
4 A PhD thesis [89] provides by far the best review of what is known about
PF3D7_1143200
[Figure 45 tree: taxa shown are Arabidopsis lyrata subsp. lyrata, Daphnia
pulex, Tetrahymena thermophila, Ichthyophthirius multifiliis, Plasmodium
fragile, P. inui, P. vivax, P. cynomolgi, P. knowlesi strain H, P. yoelii,
P. berghei strain An (a and b), P. vinckei petteri, P. chabaudi chabaudi,
P. reichenowi and P. falciparum isolates (Camp / Malaysia, Santa Lucia,
NF54, Vietnam Oak-Knoll FVO, Tanzania 2000708). Branch support values
shown range from 0.67 to 1.]
Figure 45: Phylogenetic analysis of sequences homologous to PKNH_0941100
A BLAST search was carried out against the UniProt database.
The closest matching sequences were aligned with Muscle, curated
with Gblocks and phylogeny computed with PhyML. Bootstrap
values were calculated with an aLRT test. [35, 85, 9, 30, 60, 52]
Each Plasmodium species has a single copy of the gene. The near-
est outgroup includes 2 members of the Alveolata, but also a
metazoan and a plant. Other apicomplexans lack a similar gene
entirely, suggesting that it plays a specific and important role
across Plasmodium.
the endoplasmic reticulum but becomes apical to the nuclei of mero-
zoites in segmented schizonts, localising either to the micronemes or
the rhoptries. [89] The protein is not detected in culture supernatant,
nor on the surface of free merozoites. [89]
In P. falciparum, a single-crossover C-terminal FLAG tag of the gene
was created. Subsequent immunoprecipitation identified, among other
proteins, rhoptry neck proteins RON3, RON4 and RON5 as possible
interactants. [89]
In contrast, attempts to knock the gene out with a double crossover
approach in P. falciparum were not successful despite three attempts.
Some parasites were recovered in which the 3’ UTR was ablated, but
these still showed normal protein expression levels. This led to the
suggestion that the protein was likely essential in P. falciparum. [89]
In parallel the orthologous gene (PBANKA_0905800) was included
in a P. berghei screen for proteins involved in cytoadherence (which
would reflect a non-invasive function). It was included in that screen
because some corresponding peptides were found in proteomic exper-
120 the transfer of plasmogem technology to P. KN OWLESI
iments designed to detect proteins presented on the erythrocyte mem-
brane of schizonts. [64] Two methods were used to identify erythro-
cyte membrane proteins: hypotonic lysis and surface shaving. The
authors were concerned about the possibility of contamination of the
surface-shaving peptides with merozoite membrane proteins and so
they subtracted any proteins previously found in a merozoite pro-
teome. PBANKA_0905800 peptide hits were identified in the hypo-
tonic lysis set (in which merozoite contamination was not expected),
and not in the surface shaving set, nor the merozoite proteome. This
might suggest a different role for this gene in P. berghei, but could
equally be the result of noise (MTRAP, a known invasion gene, had
the same number of peptide matches as PBANKA_0905800 in the
hypotonic lysis proteome). In this experiment mutants could be gen-
erated but were not cloned by limiting dilution. [64] No further anal-
ysis was conducted.
4.4.2.2 Target 2: DHHC3
The palmitoyl-transferase DHHC3 appears to be localised to the IMC
in P. berghei, which would be consistent with a role in invasion. [65]
However in P. falciparum the orthologous protein localises to the Golgi.
[188]
In P. berghei DHHC3 could be deleted, with integration of the marker
detected on a Southern blot, but the parasite was neither cloned nor
phenotyped in detail. [65] However an attempt to knock out the same
gene in P. falciparum with a double-crossover approach was not suc-
cessful: parasites were recovered which carried episomes, and inte-
gration could not be forced despite drug cycling and the application
of negative selection with 5-fluorocytosine. [188]
4.4.2.3 Target 3: PKNH_1207600
PKNH_1207600 is a typical "conserved Plasmodium protein, unknown
function" in that we know almost nothing about it. The sequences
from species in the simian clade are predicted to have a transmem-
brane domain and a non-cytoplasmic C-terminus by Phobius (but not
TMHMM). The only ortholog outside Plasmodium on OrthoMCLDB
is a highly divergent Dictyostelium discoideum sequence; there are no
orthologs in the Apicomplexa outside Plasmodium (Fig. 46).
[Figure 46: phylogenetic tree of P. knowlesi, P. chabaudi, P. berghei,
P. reichenowi, P. falciparum, P. yoelii, P. vivax and P. cynomolgi,
with the predicted topology of each protein. Key: Cytoplasmic, TM,
Non-cytoplasmic.]
Figure 46: Phylogenetic analysis of PKNH_1207600 orthologs in Plasmodium
Protein sequences were retrieved from PlasmoDB. Sequences
were aligned with Muscle, curated with Gblocks and phylogeny
computed with PhyML. [35, 85, 9, 30, 60, 52] Phobius was used
to predict trans-membrane domains. One was identified in the
simian clade only.
4.4.2.4 Target 4: PKNH_1306400
PKNH_1306400 is a rhodanese-like protein. Rhodanese is found in
the mitochondria where it detoxifies cyanide by acting as a sulphur-
transferase, but the domain is also found in other classes of protein in-
cluding phosphatases and ubiquitin C-terminal hydrolases. The gene
is 1:1 orthologous across all Plasmodium genomes. PKNH_1306400
does have a T. gondii homolog, but is not found in Babesia and Theileria.
There is very little additional detail available about this protein.
4.4.2.5 Target 5: A serine/threonine protein kinase
PKNH_0406200 is a putative serine/threonine protein kinase. As ki-
nases have been a focal point of previous reverse genetic research
we have some data from orthologs. The orthologous P. berghei gene,
PBANKA_0311400 has been described as essential in three experi-
ments. In the P. berghei conventional kinome screen the gene could
not be knocked out despite five attempts. Then in the P. berghei ki-
nome barseq screen too it was described as likely essential, [80] and
[Figure 47: phylogenetic tree. Taxa shown: P. vivax, P. cynomolgi,
P. knowlesi, P. chabaudi, P. berghei, P. yoelii 17XNL, P. yoelii 17X,
P. yoelii YM, P. falciparum 3D7, P. falciparum IT, P. reichenowi,
T. gondii. Branch support values shown: 0.81, 0.99, 1, 0.93.]
Figure 47: Phylogenetic analysis of DHHC3 orthologs in Plasmodium and T.
gondii
Protein sequences were retrieved from PlasmoDB, and the T.
gondii DHHC13 sequence added. Sequences were aligned with
Muscle, curated with Gblocks and phylogeny computed with
PhyML. Bootstrap values were calculated with an aLRT test. [35, 85,
9, 30, 60, 52] The tree is the one that would be predicted from the
evolution of the parasite species shown in Fig. 4.
was described thus again in the invadome barseq experiment in the
previous chapter.
In P. falciparum the picture is not so clear. A kinase screen described
the gene as likely non-essential. However this was on the basis of an
intense signal obtained in the integration PCR assay, using primers
either upstream or downstream of the target gene. [181] Successfully
targeted clones could not be recovered by limiting dilution cloning.
This experiment clearly proves that integration is occurring but not
that the integrated parasites are indefinitely viable.
4.4.2.6 Target 6: cGMP-dependent protein kinase (PKG)
PKG is a well-characterised essential protein kinase, with roles in
egress and invasion established in P. falciparum and P. berghei. [4] I
chose to include it as a negative control, to investigate the result of
targeting essential genes.
4.5 transfection of plasmogem vectors
Because P. knowlesi is not a widely used genetic system, there was
no direct data on whether any of the modifications for which the
constructs in Table 7 were designed would be tolerated. Seeking a
positive control, I decided that the vectors most likely to be tolerated
were the three HA-tagging constructs. These were transfected into
Figure 48: Positive integration PCR results
(a) First positive integration PCR band from dying parasites ini-
tially transfected with TGL60. Lane labels: Ladder, TGL58 trans.;
size markers at 4 kb, 3 kb, 2.5 kb and 2 kb.
(b) Subsequent positive integration PCR from second round of
transfections. Lane labels: Trans and WT for each of PKNH_1207600
and PKNH_0304000; size markers at 1 kb, 2 kb and 3 kb.
parasites as described in Section 2.2.6 and selected from 1 day after
transfection with 200 nM pyrimethamine.
Parasitaemia dropped rapidly upon selection, but two weeks after
transfection I could see a barely visible parasitaemia in the PKNH_1126300
transfection. I decided to isolate DNA from these parasites and per-
form an integration PCR. After 35 cycles of PCR a band was visible
at the expected size (Fig. 48a).
Despite this, parasites did not come up after 35 days in any of the
transfections.
My interpretation of these results was that correctly targeted par-
asites were being formed but were for some reason not surviving. I
made two changes for all subsequent experiments:
Drug concentration - I dropped the pyrimethamine concentration
to 100 nM, which my previous characterisation (page 113) sug-
gested would still be selective.
Erythrocyte replacement - I replaced at least 50% of the erythro-
cytes each week to ensure there were always young erythrocytes
present for optimal invasion.
I used this technique to transfect all nine constructs shown in Ta-
ble 7. Selection was applied one day after transfection with 100 nM
pyrimethamine. The first parasites were seen on day 9, and parasites
were up in all nine transfections by day 14.
While the presence of drug-resistant parasites was encouraging, it
was alarming that the knock-out vector for PKG gave rise to viable
parasites when transfected. This vector had been included as a nega-
tive control, and I did not expect to see viable parasites if the locus
was successfully modified. Since I had transfected linear DNA I had
not expected any form of episomal carriage. Thus it was essential to
genotype the parasites.
4.5.1 Genotyping
[Figure 49 schematics. Each panel shows the wild-type genome, the
pJAZZ vector and the modified locus, with the gene of interest (GOI),
the barcode and the dhfr selection cassette marked.]
(a) Generic map for knock-out vectors. Labelled primers and
amplicons: GW1, GW2, WT1, WT2, dhfr1, dhfr2, 5’ int, 3’ int.
(b) Generic map for HA-tagging vectors (dhfr cassette with 3xHA).
Labelled primers and amplicons: QCR1, QCR2, dhfr1, dhfr2, 5’ int,
3’ int.
Figure 49: Maps of double crossover integration strategies using Plasmo-
GEM vectors both for knock-outs and tagging in P. knowlesi with
the primers used for genotyping by PCR and qPCR shown.
A range of genotyping techniques were used to consider the possi-
ble events that had given rise to drug-resistant parasites.
4.5.1.1 Genotyping by single barcode sequencing verifies transfection of
correct constructs
One possible explanation for seeing parasites in all wells is cross-
contamination. I therefore performed testing to exclude any possibil-
ity I had accidentally transferred parasites from a viable transfection
into the PKG well, or contaminated the DNA prior to transfection. As
discussed in 1.9.3, PlasmoGEM vectors carry barcodes between con-
served primer binding sites primarily for the purposes of barseq on
the Illumina platform. However I was also able to use this feature to
rule out cross-contamination: I extracted genomic DNA from par-
asites and amplified a region containing the barcode (as described
in Section 2.2.12), purified the PCR product and sent it for capillary
sequencing by GATC Biotech.
I streamlined this technique by writing a script which searches
a capillary sequence for barcode primers, extracts the barcode and
queries a table to report the relevant gene to which that barcode cor-
responds.
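A script of this kind could be sketched as follows. This is a minimal illustration and not the script used in this work: the flanking primer sequences, the barcode-to-gene table and all function names are hypothetical placeholders, since the real PlasmoGEM sequences are not given in the text. Both orientations of the read are tried, because a capillary read may come from either strand.

```python
import re

# Hypothetical flanking primer-binding sites (placeholders, not the real
# PlasmoGEM sequences, which are not given in the text).
LEFT_FLANK = "GTAATTCGTGCGCGTCAG"
RIGHT_FLANK = "CCGCCTACTGCGACTATA"

# Hypothetical barcode-to-gene lookup table.
BARCODE_TABLE = {
    "ACGTTGCATGA": "PKNH_0304000",
    "TTGACCGTAGC": "PKNH_0941100",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
_PATTERN = re.compile(
    re.escape(LEFT_FLANK) + "([ACGTN]+?)" + re.escape(RIGHT_FLANK)
)


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_barcode(read):
    """Return the sequence between the two flanks, or None if absent.

    The read is searched as given and then as its reverse complement.
    """
    read = read.upper()
    for candidate in (read, reverse_complement(read)):
        match = _PATTERN.search(candidate)
        if match:
            return match.group(1)
    return None


def identify_gene(read, table=BARCODE_TABLE):
    """Return (barcode, gene); gene is None if the barcode is not in the table."""
    barcode = extract_barcode(read)
    if barcode is None:
        return None, None
    return barcode, table.get(barcode)
```

A barcode carrying a single-nucleotide error would be reported with a gene of None by this exact-match lookup, so such a case would still need to be compared against the table by eye or with an edit-distance search.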
In all nine cases the correct barcode was detected—albeit in one
case with a single nucleotide error (probably a sequencing artefact)—
giving confidence that parasites were resistant due to transfection
with the correct vector.
4.5.2 End-point PCR demonstrates that linear PlasmoGEM vectors are
able to integrate into the P. knowlesi genome
I screened the isolated genomic DNA by end-point PCR using primers
in the genome outside the homology arms of the vectors (Fig. 49) and
without further optimisation was able to obtain positive integration
PCRs for PKNH_0304000 and PKNH_1207600. Integration PCRs can
fail on pJAZZ vectors due to the long range of amplification necessi-
tated by the long homology arms, which can be exacerbated by Plas-
modium’s (A+T)-rich stretches.
However for all nine transfections PCR to detect the wild-type un-
modified locus also produced a bright amplicon band on an ethidium-
bromide stained gel, suggesting that in no case was there a clonal pop-
ulation of modified parasites. It was therefore necessary to use other
approaches to identify which PCR result was most representative of
the population.
4.5.3 Quantitative PCR indicates the proportion of parasites carrying ge-
nomic alterations
I had two concerns about integration PCR as a way of assessing my
ability to alter the parasite genome. The first was the possibility of
false negative results, since the use of pJAZZ vectors with long ho-
mology arms necessitates amplifying large regions to reach part of
the genome which was not included in the initial transfection vector.
It is possible that these amplicons will fail even if the integrated locus
is present.
The second concern was false-positives due to the non-quantitative
nature of an integration PCR. Theoretically a single parasite contain-
ing an altered genome could be enough to allow the many cycles of
a PCR to produce a band demonstrating integration, and while this
is a true result, it is not indicative of the population and will yield a
disappointing result upon dilution cloning. Furthermore the parasite
containing the integration might even be dead, or dying, due to the
modification. A final concern is that an integration PCR can occasion-
ally show a true integration event, but one which has only been made
possible by the parasite duplicating the region containing the target
gene, [79] so that there has been a real modification of the genome
but the result is not the knock-out of the target gene. This latter con-
cern also applies to pulsed-field gel electrophoresis, which confirms
merely the chromosome which has been targeted, not the nature of
the event on this chromosome.
I hypothesised that all of these issues would be solved by using
quantitative PCR (qPCR) to amplify a region expected to be removed
in successfully modified parasites.
In the case of knock-out attempts primers were designed within the
knocked out gene using PRIMER3. In the case of tags the PlasmoGEM
QCR1 and QCR2 primers were used. These bind just upstream of the
tag, and just downstream of the stop codon, thus producing a small
amplicon from wild-type locus. They could in theory also produce an
amplicon from the tagged locus but it would be >5kb. Provided short
extension times are used, this amplicon is not generated.
The abundance of a sequence as detected by qPCR is a function of
three variables:
1. The parasitaemia of the sample
2. The efficiency of DNA extraction
3. The proportion of parasites which still carry the target locus
I used the well-characterised primers Plasmo1 and Plasmo2 [165] to
establish the baseline amount of genomic DNA present in the sample,
controlling for factors 1 and 2.
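The reasoning behind this normalisation can be written out explicitly. The following is a sketch under the assumption, not stated in the text, that both primer pairs amplify with equal and near-perfect efficiency; the symbols P (parasitaemia), E (extraction efficiency) and f (proportion of parasites retaining the target locus) are introduced here purely for illustration.

```latex
\[
A_{\mathrm{target}} \propto P \cdot E \cdot f,
\qquad
A_{\mathrm{ref}} \propto P \cdot E
\quad\Longrightarrow\quad
\frac{A_{\mathrm{target}}}{A_{\mathrm{ref}}} \propto f
\]
\[
A = 2^{-C_t}
\quad\Longrightarrow\quad
f \propto 2^{-\left(C_t^{\mathrm{target}} - C_t^{\mathrm{ref}}\right)}
\]
```

Dividing by the reference amplicon cancels P and E, leaving a quantity that depends only on the proportion of parasites that still carry the target locus.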
[Figure 50 plot. x-axis: Parasite line (58 to 66). y-axis: Relative
abundance (0.0 to 2.5). Series: Plasmo1, Plasmo2; 63 - PKNH_0941100;
62 - DHHC3.]
Figure 50: qPCR shows successful disruption of the PKNH_0941100 gene
Relative abundance was calculated by taking $2^{-C_t}$, and nor-
malised for each primer pair to the TGL58 transfection. There
are three primer sets shown here. Plasmo1/Plasmo2 are well-
characterised primers which detect DNA from any parasite
genome. The two amplicons 62 and 63 are located within DHHC3
and PKNH_0941100 respectively. We see different samples have
different abundances for the Plasmo1/Plasmo2 amplicon, repre-
senting parasitaemia and the efficiency of DNA extraction. The
two regions of interest are highlighted in boxes. TGL62 has levels
similar to other amplicons. This indicates that although resistant
parasites have appeared the transfection attempt has not success-
fully produced an integrated population. In contrast for TGL63,
wild-type amplicon levels are very much lower than would be
expected from the Plasmo1/Plasmo2 primers and so we can in-
fer that the knock-out has succeeded in the vast majority of the
parasite population.
In initial experiments to establish the effectiveness of this qPCR
method I used each pair of primers on every sample. The results for
three primer pairs are shown in Fig. 50.
The results show that amplicons co-vary between samples, repre-
senting different parasitaemias and efficiencies of extraction. All three
primer pairs covary for DHHC3 indicating that most parasites retain
the locus, but for the attempt to knock-out PKNH_0941100 the am-
plicon located within the target is present at a far lower level than
the other primer pairs, showing that most parasites have this region
ablated.
The Plasmo1/Plasmo2 primers can be used to calculate a normalised
copy number for each gene. I used this to calculate the copy-numbers
of all 9 genes.
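As an illustration of how such a normalised copy number can be computed from threshold cycles, here is a minimal sketch. It assumes that both primer pairs double the product every cycle and that a calibrator sample with a known single copy of the target is available; the function names and the Ct values in the usage note are invented for the example and are not data from this work.

```python
def relative_abundance(ct):
    """Convert a threshold cycle to a relative abundance, 2^-Ct."""
    return 2.0 ** (-ct)


def normalised_copy_number(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Copy number of a target amplicon relative to a calibrator sample.

    ct_target, ct_ref:         Ct of the target and reference amplicons
                               (e.g. Plasmo1/Plasmo2) in the test sample.
    ct_target_cal, ct_ref_cal: the same two Ct values in the calibrator.
    Assumes perfect doubling per cycle for both primer pairs.
    """
    sample_ratio = relative_abundance(ct_target) / relative_abundance(ct_ref)
    calibrator_ratio = (
        relative_abundance(ct_target_cal) / relative_abundance(ct_ref_cal)
    )
    return sample_ratio / calibrator_ratio
```

For example, with made-up values, a target that crosses the threshold five cycles later than in the calibrator at the same reference Ct, `normalised_copy_number(25, 18, 20, 18)`, corresponds to a copy number of 2^-5, or about 0.03.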
[Figure 51 plot. x-axis: Transfectant (58B - 0304000, 59B - 0406200,
60B - 1126300, 61B - 1136700, 62B - 0918700, 63B - 0941100,
64B - 1207600, 65B - 1306400, 66B - 0423200). y-axis: Copy number
(0.0 to 1.5). Labelled points: 59, 63, 60, 58.]
Figure 51: qPCR results reveal four modifications which are dominant in their pop-
ulations after transfection. Amplicons for TGL58, TGL60, TGL61,
TGL63 are very much lower in abundance in their corresponding
transfection than other controls, indicating four successful modi-
fications: three tags and one knock-out.
Four showed successful modification (Fig. 51). Five showed wild-
type amplicons still present at the wild-type level. These results are
summarised in Table 8.
Gene ID        Description                    Construct  Result
HA-tagging constructs
PKNH_0304000   Apical sushi protein           TGL58      Success
PKNH_1126300   Ribonuclease H2 subunit A      TGL60      Success
PKNH_1136700   ROP14                          TGL61      Success
Knock-out constructs
PKNH_0406200   Ser/Thr protein kinase         TGL59      Failure
PKNH_0918700   DHHC3                          TGL62      Failure
PKNH_0941100   DnaJ protein                   TGL63      Success
PKNH_1207600   Cons. Plasm. protein, u.f.     TGL64      Failure
PKNH_1306400   Rhodanese like protein, put.   TGL65      Failure
PKNH_0423200   Protein kinase G               TGL66      Failure
Table 8: Result of pJAZZ transfection attempts
It is striking that every HA-tagging attempt was successful, whereas
only one of the knock-outs attempted was. This, combined with the
failure to knock-out negative control PKG, gives a first indication that
these experiments are producing meaningful results.
4.5.3.1 HA-tagging in P. knowlesi is efficient and permits immunofluores-
cence analysis
There are two reasons to expect tagging to succeed. Firstly, C-terminal
tagging with a small epitope is unlikely to affect the functionality of
most proteins (though in rare cases it might result in mislocalisation
by interfering with a localisation signal). Secondly, tagging does not
remove any part of the genome, and so the length of the region which
must be excised by host nucleases is less than for a knock-out.
As a proof of concept that pJAZZ HA-tagging vectors could be
useful in P. knowlesi I decided to attempt immunofluorescence analysis
using the parasites that came up from the TGL58 transfection, which
attempted to HA-tag apical sushi protein. After a number of
unsuccessful attempts involving methanol/acetone fixation I used a
paraformaldehyde fixation method [193] and was able to observe an
apical localisation (Fig. 52).
The organisation of the P. knowlesi schizont, with merozoites radiating
from the centre of the parasites, appeared to prevent the resolution
of individual nuclei. In part due to the lack of P. knowlesi specific anti-
bodies and in part because the position of apical sushi protein in the
rhoptries is already well established, I did not attempt co-localisation.
4.5.3.2 Attempt to disrupt DnaJ protein PKNH_0941100 was successful
Quantitative PCR identified a very significant decrease in abundance
of the amplicon located within this gene, indicating that it has been
lost in the majority of the population. This makes it the first gene to
be knocked-out in P. knowlesi using a PlasmoGEM approach.
This success in P. knowlesi is in contrast to several attempts in P.
falciparum but in agreement with results from P. berghei. While it is
possible that—like DHHC3 below—this protein plays different roles
in the different species, the fact that it can now be knocked out in
two out of three species would probably justify renewed attempts at
deletion in P. falciparum using newly developed CRISPR approaches.
(a) The flattened z-stack depicting a mature schizont, with punctate
staining labelling what are presumably the rhoptries. Panels: BF,
HA, DAPI, Merge.
(b) Three-dimensional deconvolution of z-stack. Panels: DAPI, HA,
Combination.
Figure 52: Immunofluorescence of 3xHA-tagged apical sushi protein in P. knowlesi
reveals the expected apical localisation.
[Figure 53 alignment; ruler marks at positions 260, 270, 280, 290]
P. berghei     GIKTFFEWIIIDKKRLKKNVSQSEQDIENLKVEA-ERSIKY-
P. knowlesi    GIRTFFEWLIIDKKRLRKSHAE-NQDIEIQDIER-VMSLKS-
P. falciparum  KIKTFFDWIIIDKKRSKRSQNL-DQDMEKHEIERGEITLKNY
Conservation   5*9***9*+******5999375-7**9*4469*4-459+*6-
Figure 53: The C-terminus of PkDHHC3
While the DHHC3 genes of P. knowlesi, P. falciparum and P. berghei
are in general well conserved, and have the same evolutionary
relationship as their respective species (as shown in Fig. 47), the
C-terminus is not well conserved.
4.5.3.3 Attempt to disrupt DHHC3 was unsuccessful
I was not able to knock-out DHHC3. This is in accordance with pre-
vious attempts in P. falciparum, but not with those in P. berghei where
a knock-out was successfully made. [65] It is possible that the P. fal-
ciparum transfection failed because of the greater difficulty in geneti-
cally modifying this species, but the fact that DHHC3 is differentially
localised in P. falciparum and P. berghei [188] suggests that the protein
plays different roles in these species. We do not know, of course, how the
P. knowlesi protein is localised. However research on the mammalian
palmitoyltransferases suggests that C-termini may be involved in
DHHC localisation. [83] The C-terminus
of PkDHHC3 shows no greater homology to PbDHHC3 than it does
to PfDHHC3 (Fig. 53) allowing the possibility that like PfDHHC3,
PkDHHC3 may be a putatively essential Golgi-targeted enzyme.
It would be interesting to localise DHHC3 in P. knowlesi to shed
further light on these possibilities. This could be easily achieved with
PlasmoGEM tagging vectors.
4.5.3.4 Attempt to disrupt protein kinase G was unsuccessful
We could not detect any significant number of parasites with this
locus removed, as we would expect given it is the negative control.
Nevertheless drug-resistant parasites did come up in this transfection
and their origin must be investigated.
4.5.3.5 Attempt to disrupt PKNH_0406200 was unsuccessful
We could not detect any significant number of parasites with this lo-
cus removed, which is in accordance with a number of transfection at-
tempts in P. berghei but not with the positive integration PCR reported
in P. falciparum. Given the potential weaknesses of integration PCR
Table 9: Summary of results for three genes for which knock-out attempts have been
made in three Plasmodium species
               P. falciparum                  P. berghei               P. knowlesi
PKNH_0941100   Refractory to deletion         Deleted                  Deleted
PKNH_0918700   Refractory to deletion         Deleted                  Refractory to deletion
PKNH_0406200   Integration PCR but uncloned   Refractory to deletion   Refractory to deletion
outlined earlier in the chapter, it may be worth revisiting this gene in
P. falciparum and attempting to produce a clone (with the help of new
CRISPR technology). If this does not succeed we can judge the gene
likely to be essential across the three species.
4.5.3.6 Attempts to disrupt PKNH_1207600 and PKNH_1306400 were
unsuccessful
We could not detect a significant number of knock-out parasites for
PKNH_1207600 or PKNH_1306400. This means there is a possibility
these genes are essential or at least that their deletion has some ef-
fect on parasite fitness. But with a lack of data on the proportion of
targetable genes in P. knowlesi these suggestions are not conclusive.
4.5.4 Investigating the cause of resistant parasites without genomic modi-
fication
At this point it is clear that attempts to knock-out essential genes in P.
knowlesi can give rise to viable parasites which still possess the target
locus. I considered a number of possible explanations for this result.
4.5.4.1 Parasites have more than one copy of human dhfr per genome
One concern is that these parasites might have spontaneously de-
veloped pyrimethamine resistance, which requires only a single nu-
cleotide substitution [5]. To investigate this possibility I assayed what
proportion of parasites carried the human dhfr gene introduced by
my constructs or at least what the average copy number of this gene
was. I used PRIMER3 to design primers to the human dhfr gene, and
to the PkHsp70 UTR that is used to express it and performed qPCR
(Fig. 54).
[Figure 54 plot. x-axis: Parasite line (58 to 66). y-axis: Relative
abundance (0.0 to 2.5). Series: WT primer pair (Plasmo1, Plasmo2);
dhfr; hsp70.]
Figure 54: Elevated dhfr copy number in non-integrating transfections suggests
the presence of episomes
Dhfr and hsp70 abundance in genomic DNA from each transfec-
tant is indicated. Primer pairs were normalised to each other by
setting to a relative abundance of 1 for the successful TGL58 trans-
fection.
The primer pairs in this analysis were normalised to a successful
transfection, TGL58, which we expect to contain one copy of the
Plasmo1/Plasmo2 amplicon, one copy of human dhfr and two copies
of PkHsp70 UTR.
We can make two observations. Firstly it is clear that the resistance
marker is present in significant quantities in all samples, so sponta-
neous resistance is not at issue. Secondly, we can also see that four
out of five of the failed deletion attempts have more than 1 copy of
human dhfr per genome. (The Hsp70 amplicon appears at an inter-
mediate level because it is present at two copies in the control and so
each extra plasmid copy increases copy number by 50% of the con-
trol rather than 100%.) This presence of multiple copies could be the
result of multiple insertions into some site in the genome (perhaps
segmental duplications of the target locus), or the formation of epi-
somes (somehow from linear DNA).
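The arithmetic behind the intermediate Hsp70 signal can be made concrete with a short sketch. It assumes, for illustration only, that every vector copy (integrated or episomal) carries one human dhfr and one PkHsp70 UTR, and that each genome additionally retains one endogenous PkHsp70 UTR; the function name is hypothetical.

```python
def expected_relative_abundance(vector_copies):
    """Expected (dhfr, hsp70) signal relative to the TGL58 control.

    The control is taken to carry one vector copy per genome, giving
    one dhfr amplicon and two PkHsp70 UTR amplicons (one endogenous,
    one on the vector).
    """
    dhfr_in_control = 1
    hsp70_in_control = 2
    dhfr = vector_copies / dhfr_in_control
    hsp70 = (1 + vector_copies) / hsp70_in_control
    return dhfr, hsp70


for copies in (1, 2, 3):
    dhfr, hsp70 = expected_relative_abundance(copies)
    print(f"{copies} vector copies: dhfr {dhfr:.1f}x, hsp70 {hsp70:.1f}x")
```

Under these assumptions each additional vector copy raises the dhfr signal by a full unit but the Hsp70 signal by only half a unit, which is the pattern described above.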
I decided to attempt to investigate these possibilities further by
whole genome sequencing, but first I dilution cloned parasites in or-
der to remove potential confounding factors from the analysis. I also
attempted to dilution clone the PKNH_0941100 knock-out to estab-
lish whether a clean knock-out could be generated.
Analysis of the result of dilution cloning parasites transfected with
this construct (Fig. 55) revealed that non-targeted parasites had been
effectively eliminated: a very low signal was seen for the unmodified
locus, and melt-curve genotyping revealed this to be a different prod-
uct (data not shown), indicating the success of dilution cloning.
[Figure 55 plot. x-axis: Parasite line (63 - before dilution cloning,
63 - after dilution cloning, WT). y-axis: Relative abundance (log
scale, 10^-7 to 10^1). Series: Deleted region (405, 406); Plasmo1,
Plasmo2 (present in all parasites).]
Figure 55: Quantitative PCR confirms attempt to dilution clone PKNH_0941100
knock out.
The apparent level of intact locus fell by more than 5 orders of
magnitude (and melt curve analysis revealed the remnant locus
apparently to be the result of non-specific amplification.)
4.5.5 Whole genome sequencing of transfectants
DNA was extracted from five successfully dilution cloned cultures
(TGL58, TGL60, TGL62, TGL63, TGL64). This DNA was labelled with
index tags and the 5 samples pooled on one lane of a MiSeq.
The sequencing data was analysed in three ways.
4.5.5.1 Barcode verification
Once again, a search was made for barcodes to verify that no cross
contamination had occurred. The grep command was used to search
raw reads for the barcode primer sequence, and the barcode following
this was checked to ensure it corresponded to the intended gene. In
each case it did.
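The same kind of check can also be expressed in Python rather than with grep, which makes it easy to tally every barcode seen in a sample. The sketch below is an illustration only: the primer sequence and barcode length passed in are hypothetical placeholders, and reads from the opposite strand would need to be reverse-complemented first.

```python
import re
from collections import Counter


def fastq_sequences(lines):
    """Yield the sequence line from each four-line FASTQ record."""
    for index, line in enumerate(lines):
        if index % 4 == 1:
            yield line.strip()


def tally_barcodes(reads, primer, barcode_length):
    """Count the barcode_length bases that directly follow primer in each read."""
    pattern = re.compile(re.escape(primer) + "([ACGT]{%d})" % barcode_length)
    counts = Counter()
    for read in reads:
        match = pattern.search(read)
        if match:
            counts[match.group(1)] += 1
    return counts
```

With an open FASTQ file handle passed to `fastq_sequences`, a clean sample would be expected to give a single dominant barcode in the resulting counter, and any second barcode at appreciable frequency would point to cross-contamination.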
4.5.5.2 Mapping
These genomic DNA reads were mapped with bowtie2 to the P. knowlesi
genome, and some of the common heterologous sequences used in
the transfection, as described in section 2.2.15.2.
This method of analysis is capable of detecting SNPs and copy-
number changes, as well as giving some hints as to the location of
indels. As anticipated, the DNA extracted from parasites in which
we had attempted to knock out PKNH_0941100 had essentially no
coverage in the knocked-out region of the gene (Fig. 56).
Figure 56: Whole genome sequencing reveals successful gene deletion at the
PKNH_0941100 locus (highlighted in red)
The other two sequenced knock-outs, in which qPCR data had sug-
gested that the knock-out had not succeeded, had very different pat-
terns of coverage to this clean knock-out.
In both cases (Figs 57 and 58), the level of coverage at the gene
itself was in line with coverage at any other arbitrary location in the
genome. However regions on either side of the gene had elevated
coverage.
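A coverage comparison of this kind can be sketched as follows, assuming that per-base read depths across the region have already been extracted from the alignment by some means. The function name and window size are hypothetical, and the median is used as the baseline so that a few high-coverage windows do not distort it.

```python
from statistics import median


def windowed_relative_coverage(depths, window):
    """Return (window_start, mean depth / median per-base depth) per window.

    depths is a list of per-base read depths across a region; a value
    near 1 means normal coverage, near 2 a doubling, near 0 a deletion.
    """
    baseline = median(depths)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalise")
    result = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        result.append((start, sum(chunk) / len(chunk) / baseline))
    return result
```

In this representation a clean knock-out shows a ratio near zero over the deleted region, whereas the pattern described here shows a ratio near one over the gene and above one in the flanking windows.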
Figure 57: Whole-genome sequencing reveals a failure to delete PKNH_1207600
(highlighted in red).
The green line indicates the coverage from the parasites of in-
terest, transfected with a pJAZZ vector targeting PKNH_1207600.
Their coverage peaks in regions either side of the gene, rising
to more than 2x normal coverage. Other colours indicate control
populations (parasites transfected with other constructs).
There are two theoretically possible origins for this increased cov-
erage. The first possibility is that a segmental duplication occurred
before the targeting construct was incorporated. This would have al-
lowed one of these copies to be knocked out without affecting par-
asite fitness, returning coverage to 1x in that region. The alternative
possibility, given that the entirety of the region at higher coverage is
contained within the homology region that exists on the vectors, is
that the increased coverage may come merely from additional copies
of the vector. These would be present in some form that does not
involve knocking out the target locus: i.e. either an episome or a non-
targeted integration.
The synthetic sequence I constructed for mapping to was made
up of the P. knowlesi genome concatenated to the selection cassette
and this enables the copy number of the cassette to be estimated.
Fig. 59 shows a difference here between the two failed attempts, one
to knock out PKNH_1207600 and one to knock out DHHC3. The copy
number of the selection cassette is much more than 1x in the case of
PKNH_1207600 but at about 1x for DHHC3.
It was necessary to understand the context in which the additional
reads corresponding to the homology regions were found. This could
Figure 58: Whole genome sequencing reveals failure to target DHHC3.
The targeted gene is shown in red, the blue coverage line is from
the attempt to knock it out and has coverage which increases in
the region either side of the target gene.
indicate whether they corresponded to episomes or to some duplica-
tion event. I started to examine the subset of reads that did not come
from ‘proper pairs’. Proper pairs are reads which map as expected, to
opposite strands a small distance apart. When I investigated reads at
the breakpoint of the increased coverage I found a preponderance of
reads whose mates did not map anywhere in the P. knowlesi genome,
nor to the gateway cassette. I took the sequence of one of these mates
and queried it against the NCBI’s non-redundant nucleotide collec-
tion. The sequence turned out to belong to the
backbone of the pJAZZ vector. This result was unexpected because
the PlasmoGEM protocols I followed called for the pJAZZ arms to be
removed by NotI digestion. However in retrospect this should have
been less of a surprise since we know from running the digest by gel
electrophoresis that NotI digestion is substantially incomplete (even
when the amount of DNA is substantially reduced).
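The selection of reads whose mates fail to map can be illustrated with the FLAG field of the SAM format, in which bit 0x1 marks a paired read, 0x2 a proper pair, 0x4 an unmapped read and 0x8 an unmapped mate. The sketch below is one possible way to pull out such reads from SAM text and is not the procedure used here; the function names are hypothetical.

```python
PAIRED = 0x1
PROPER_PAIR = 0x2
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def classify_read(flag):
    """Classify a read from its SAM FLAG value."""
    if not flag & PAIRED:
        return "unpaired"
    if flag & READ_UNMAPPED:
        return "unmapped"
    if flag & MATE_UNMAPPED:
        return "mate unmapped"
    if flag & PROPER_PAIR:
        return "proper pair"
    return "improper pair"


def reads_with_unmapped_mates(sam_lines):
    """Yield (read name, reference name, position) for mapped reads
    whose mate did not map anywhere in the reference."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if classify_read(int(fields[1])) == "mate unmapped":
            yield fields[0], fields[2], int(fields[3])
```

The positions yielded by `reads_with_unmapped_mates` can then be compared with the breakpoints of the elevated coverage, and the unmapped mates themselves queried against a sequence database.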
I constructed a new synthetic genome to map to, this time contain-
ing the pJAZZ arms, and remapped reads to this new supercontig.
The mapping results for the five whole-genome sequenced clones
are shown in Fig. 60. Much, but not all, of the long pJAZZ
arm is present for PKNH_1207600, but the entire short arm is absent.
Additionally there is a step-change in coverage at one point in the
arm. This must represent a site of rearrangement with multiple copies
of the outer region of the long arm, but the exact structure is hard to
deduce because of the repetitive nature of the sequence.
[Figure 59 panels: genomic region; selection cassette.]
Figure 59: Coverage of selection cassette as compared to the general genome in
whole genome-sequenced samples.
All samples show similar coverage with the exception of the attempt
to knock out PKNH_1207600 (pink), which shows greatly
elevated levels of the selection cassette.
Fig. 60 demonstrates that successfully targeted parasites lack any
sequences from the pJAZZ arms (as one would expect from double
crossover), and that the other failed knock-out analysed, DHHC3, also
possesses only the pJAZZ long arm. Interestingly, a greater length of
the long arm is included and there is a coverage increase that begins
at a different point, illustrating that the presumed recircularisation
event that gives rise to episomes does not occur in only one way.
The presence of the long pJAZZ arm seems to be a common factor
in the transfectants which come up without targeting the correct locus,
sometimes with noticeable copy number changes. This suggests
to me that P. knowlesi is able to recircularise linear DNA with which
it is transfected and maintain it, although it is not impossible that
a concatemeric form of the cassette is incorporated into the parasite
genome.
4.5.6 pJAZZ transfection results in P. knowlesi are reproducible
Finally, to identify whether the transfection results I obtained were
repeatable I transfected 7 of the constructs (TGL58-TGL65) an addi-
tional time. On this occasion I lowered the amount of DNA trans-
fected to 3 µg to see whether I could reduce the number of non-
targeting events which occurred. The same four successful modifica-
tions were achieved (TGL58, TGL60, TGL61, TGL63), as assessed by
[Figure 60 panels: pJazz long arm; pJazz short arm. Coverage tracks shown for TGL62 and TGL64. No reads: TGL58, TGL60, TGL63.]
Figure 60: Genome sequencing reveals transfectants still have regions from
the pJAZZ backbone, despite NotI digestion prior to transfection.
qPCR. Apparent episomes were formed in all other cultures except
TGL62, where parasites did not come up.
This suggests that reduced DNA concentration, or perhaps reduced
transfection efficiency achieved in any way, may have a role in increas-
ing the specificity of targeting. The reproducibility of the successful
results is encouraging.
4.6 towards crispr/cas9 in p. knowlesi
The data from [142] shown in Fig. 6 suggest that P. knowlesi is
the malaria parasite species best able to take up exogenous
DNA. This is shown by its maximal transfection efficiency of
30% [142], as well as by its consistent ability to give rise to
drug-resistant parasites after transfection, shown in this chapter.
However, the speed with which vectors can be integrated into the
parasite genome does not parallel this. Parasites with integrated DNA
come up 11 days later than those with episomes [142], and I estimated
in the introduction, based on these data, that just 1 in 5,000 parasites
that take up DNA result in an integration event.
Clearly any strategy to increase this proportion would be very use-
ful both to increase the speed with which modified parasites could
be generated in P. knowlesi and perhaps to lower the rate of episome
formation seen in this chapter.
One of the limiting factors on the integration of DNA into the par-
asite genome may be the formation of a double-strand break in the
genomic locus to be targeted. In P. falciparum the use first of zinc-
finger nucleases [184, 144] and then very recently of CRISPR/Cas9
[212, 72, 201] has permitted greater integration efficiencies and the
isolation of correctly targeted parasites without drug cycling.
I decided to attempt to bring this technology to P. knowlesi, hoping
that the combination of a high transfection efficiency with techniques
to boost the rate of integration might lead to an ideal genetic system.
In this section I will describe the progress made towards this goal.
4.6.1 Bio-informatic prediction of cutting efficiency of guide RNAs target-
ing the P. knowlesi genome
Whereas a number of online guide-RNA design tools support P. fal-
ciparum, none were available which supported P. knowlesi as I began
this project. In order to ultimately use CRISPR at scale, and also to
analyse its potential in P. knowlesi, I decided to create a streamlined
process for designing sgRNAs for P. knowlesi. There are three
considerations for the design of guide RNAs: potential off-target sites, the
on-target cutting efficiency and the location of the break.
Based on BLAST searches of candidate guide RNAs, I hypothesised
that, given the much reduced size of the Plasmodium genome as
compared to that of mammals, off-target cutting was unlikely to be a
concern; the higher (G+C)-content of P. knowlesi should also minimise
off-target activity as compared to P. falciparum. In addition, given
that Plasmodium does not possess machinery for non-homologous end
joining, if off-target cutting occurs in occasional parasites it will sim-
ply result in their deaths rather than in non-targeted insertion events.
I therefore focused on on-target cutting efficiency and break-location.
Algorithms to estimate on-target cutting efficiency based on se-
quence have been created for mammalian systems [55]. While it is
not clear whether these factors will also apply to Plasmodium, if they
simply describe the interaction between the Cas9 protein and DNA
then it seems likely that they might. In the absence of specific
Plasmodium data on on-target efficiency, I decided to use this scoring
system.
[Histogram: x-axis, on-target score (0.0 to 1.0); y-axis, frequency (sgRNAs; 0 to 40,000).]
Figure 61: Distribution of on-target cutting scores for all possible guide RNAs in
the P. knowlesi genome
Plotting the scores from all 1.25 million possible sgRNAs in
P. knowlesi indicates that the majority have scores associated
with potentially low on-target cutting efficiency. Nevertheless the
small proportion of high-scoring guides still represents a sizeable
number.
I wrote a script to identify, and score, every possible guide RNA in
the P. knowlesi genome. The strategy employed was to search through
a Fasta file containing genomic sequences for every P. knowlesi gene
(downloaded from PlasmoDB) and to identify every PAM (Proto-
spacer adjacent motif) site, and the corresponding guide RNA se-
quence. Then it also extracted the 30-mer containing this sequence
in a genomic context and calculated the on-target score according to
[55].
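The search strategy just described can be sketched as follows. This is an illustrative reconstruction, not the script itself. It assumes an NGG PAM (as for S. pyogenes Cas9), assumes that both strands of each gene sequence are searched, and assumes that the 30-mer consists of 4 nt upstream, the 20 nt protospacer, the 3 nt PAM and 3 nt downstream. The function names are hypothetical, and the scoring step of [55] is deliberately not reproduced.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]

# 4 nt upstream + 20 nt protospacer + NGG PAM + 3 nt downstream = 30 nt.
# The zero-width lookahead lets overlapping candidate sites all be reported.
_SITE = re.compile(r"(?=([ACGT]{4}([ACGT]{20})[ACGT]GG[ACGT]{3}))")

def find_guides(gene_seq):
    """Yield (strand, guide_20mer, context_30mer) for each candidate site."""
    gene_seq = gene_seq.upper()
    for strand, seq in (("+", gene_seq), ("-", reverse_complement(gene_seq))):
        for match in _SITE.finditer(seq):
            yield strand, match.group(2), match.group(1)

# Hypothetical toy sequence containing a single site on the forward strand
guide = "ATATATATATATATATATAT"
toy = "AAAA" + guide + "TGG" + "AAA"
for hit in find_guides(toy):
    print(hit)
```

Each 30-mer returned here would then be passed to the on-target scoring function.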
4.6.1.1 Theoretically feasible scale of CRISPR in P. knowlesi
My analyses identified 1.25 million candidate guide RNAs within the
coding sequences of P. knowlesi. The vast majority of these have low
on-target scores as shown in Fig. 61. This is also reported in mammalian
systems, and in fact Fig. 62a shows that there is a higher
proportion of high-scoring guide RNAs in P. knowlesi than in M. musculus.
This may be because of a reported ‘strong bias against guanine
immediately 3′ of the PAM’ [55], which would be reduced in an (A+T)-rich
organism such as P. knowlesi. In practice one is typically using only one
or a handful of guide RNAs at a time, so the most important consid-
eration is what proportion of genes are targetable by high-efficiency
guides.
[Plot comparing M. musculus and P. knowlesi; legend (on-target score bins): 0.8 to 1.0, 0.6 to 0.8, 0.4 to 0.6, 0.2 to 0.4, 0 to 0.2.]
(a) The left hand side of this plot is adapted from Doench et al. (2014) [55]. The
right hand side was calculated by me based on the full 1.25 million sgRNAs
that exist within the genes of P. knowlesi.
[Histogram: x-axis, maximum on-target score for gene (0.0 to 1.0); y-axis, frequency (genes; 0 to 800); legend (first base): any, G.]
(b) Histogram showing the distribution of genes according to their maximum
on-target score calculated with the Doench et al. algorithm. Both
the distribution allowing any starting base and the distribution requiring
G as the starting base are shown.
Figure 62: Distribution of predicted CRISPR on-target cutting scores compared to
mammalian systems and analysed on a per-gene basis indicates a large
potential for CRISPR in P. knowlesi.
In order to consider this I selected the highest scoring guide RNA
within each gene and then considered the distribution of genes (Fig. 62b).
This revealed that the vast majority of genes had at least one
high-scoring guide. The use of the U6 promoter requires that transcription
begin, as in the native U6, with a guanine residue (although it is
possible to add a non-binding G to the 5′ end of the sequence if required).
Imposing this requirement reduced the number of genes with high scores
but still left 88% of genes with a maximum score greater than 0.6 and
47% with a score greater than 0.8.
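The per-gene summary described above amounts to taking a maximum within each gene, optionally after discarding guides that do not begin with G. A minimal sketch is given below; the function names, gene identifiers, guide sequences and scores are all hypothetical and serve only to show the calculation.

```python
def best_score_per_gene(scored_guides, require_g_start=False):
    """Map each gene to its highest on-target score.

    scored_guides: iterable of (gene_id, guide_sequence, score) tuples.
    require_g_start: keep only guides whose first base is G (U6 constraint).
    """
    best = {}
    for gene_id, guide, score in scored_guides:
        if require_g_start and not guide.startswith("G"):
            continue
        if gene_id not in best or score > best[gene_id]:
            best[gene_id] = score
    return best

def fraction_of_genes_above(best, threshold, total_genes):
    """Fraction of all genes whose best guide scores above the threshold."""
    return sum(score > threshold for score in best.values()) / total_genes

# Hypothetical scores for three made-up genes (guide sequences truncated)
toy = [
    ("geneA", "GACC", 0.85), ("geneA", "TACC", 0.95),
    ("geneB", "AACC", 0.70), ("geneB", "GTCC", 0.40),
    ("geneC", "CACC", 0.90),
]
any_base = best_score_per_gene(toy)
g_only = best_score_per_gene(toy, require_g_start=True)
print(fraction_of_genes_above(any_base, 0.8, total_genes=3))
print(fraction_of_genes_above(g_only, 0.8, total_genes=3))
```

Passing the total number of genes explicitly means that genes left with no eligible guide still count in the denominator.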
In subsequent work I prioritised guide RNAs which received scores
greater than 0.8: 80% of these should be in the top quintile⁵ for
activity and the remainder in the second and third quintile according to
the analysis in M. musculus.
⁵ A quintile is a 20% proportion of ranked data, i.e. 20 percentiles.
4.7 generation of cas9 mother vectors
The starting point for the CRISPR constructs was a pDC2-based vec-
tor expressing Cas9, and the sgRNA from a U6 promoter, developed
by Marcus Lee. Since there are no data as to whether the P. falciparum
U6 promoter is functional when used heterologously in P. knowlesi, I
exchanged this promoter for the corresponding 500 bp sequence in
P. knowlesi. A PCR reaction was used to amplify the P. knowlesi U6
promoter with overlaps corresponding to the insertion site in the des-
tination vector (primers: TPR458, TPR459). The P. falciparum construct
was digested with SalI and BbsI to remove the P. falciparum U6 pro-
moter and the PCR amplicon assembled into the backbone by Gibson
Assembly.
Two versions of the construct were produced, one with yDHODH
as the resistance marker (based on a construct from Zenon Zenonos)
and one with hDHFR. A map is shown in Fig. S2.
These constructs contain sites for the type IIS restriction enzyme
BbsI which allow the precise insertion of guide-RNA-encoding DNA
sequences, which are produced by annealing two oligos with 5′ overhangs.
Cas9-guide RNA constructs were made for use in tandem with
each of the knock-out constructs described in this chapter, by annealing
desalted oligos and ligating with T4 ligase into the BbsI-digested
vector.
4.8 enhancing pjazz transfections with crispr/cas9
An initial proof-of-concept experiment was carried out with the pJAZZ
vector already established to target PKNH_0941100. This vector, which
was used several times in section 4.5 to successfully delete this gene,
was transfected alone, and also together with a cassette containing Cas9
and a guide RNA targeted to PKNH_0941100 under the P. knowlesi U6
promoter. The version of the construct used carried yDHODH rather than
hDHFR, and no selection was applied with Dsm1 for the presence
of the CRISPR cassette, in part because of the potential problems with
this marker in P. knowlesi outlined earlier in the chapter.
Transfection      pJazz DNA (µg)   Crispr DNA (µg)
pJazz alone       15               0
pJazz & Crispr    15               30
Selection was applied with 100 nM pyrimethamine on day 2
post-transfection. Parasitaemia rapidly decreased upon selection but
parasites were apparent in both transfections by day 10 post-transfection.
On day 12 after transfection parasitaemias were as follows:
Transfection      Day 12 parasitaemia
pJazz alone       0.28%
pJazz & Crispr    1.2%
DNA was extracted and used for qPCR analysis to measure the level
of ablation of the wild-type locus. The results are shown in Fig. 64.
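The exact normalisation is not spelled out in this section. One common way to express such data is an efficiency-corrected relative quantification, in which the target amplicon is compared with a reference amplicon and with wild-type DNA. The sketch below illustrates that general calculation under these assumptions; the function name, the Ct values and the default efficiencies are hypothetical and are not taken from the experiment.

```python
def locus_presence(ct_target_sample, ct_ref_sample,
                   ct_target_wt, ct_ref_wt,
                   eff_target=2.0, eff_ref=2.0):
    """Target locus abundance in a sample relative to wild-type DNA.

    eff_* are per-cycle amplification factors (2.0 means perfect doubling).
    The reference amplicon corrects for the amount of DNA in each reaction.
    """
    target_ratio = eff_target ** (ct_target_wt - ct_target_sample)
    ref_ratio = eff_ref ** (ct_ref_wt - ct_ref_sample)
    return target_ratio / ref_ratio

# Hypothetical Ct values: the target comes up 3 cycles later than in wild type,
# of which 1 cycle is explained by there being less DNA in the sample.
print(locus_presence(ct_target_sample=23.0, ct_ref_sample=21.0,
                     ct_target_wt=20.0, ct_ref_wt=20.0))
```

A value of 1.0 corresponds to a fully intact locus and values near 0 to near-complete ablation.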
The fact that parasites came up more rapidly, and with lower levels
of intact locus, compared to a typical pJAZZ transfection provides
encouragement that CRISPR/Cas9 technology may substantially boost
both the throughput and the targetability of P. knowlesi genes by
pJAZZ vectors. However, further experiments are needed to establish
that the effect of the technology in this system is reproducible.
Experiments conducted to attempt to select for the Cas9 vector
rather than an integrated resistance marker, and thus prove the se-
lective force of Cas9-mediated DNA cleavage have not yet been suc-
cessful.
[Gene diagram: Crispr binding site marked within PKNH_0941100 (DnaJ protein); scale 0 to 1000.]
Figure 63: Position of Crispr/Cas9 cutting site within PKNH_0941100, a
gene already known to tolerate deletion in P. knowlesi from work
earlier in this chapter.
[Bar chart: y-axis, locus presence (0% to 100%); x-axis, sample (Wild-type, pJazz alone, pJazz + Crispr); panels: PKNH_0941100 amplicon and Plasmo1/Plasmo2.]
Figure 64: qPCR data from a single experiment demonstrate lower levels of
wild-type locus in a pJAZZ transfection when supplemented by Crispr
Results are normalised for primer efficiency and absolute DNA
content. Error bars represent 95% confidence intervals calculated
from technical triplicates from qPCR rather than biological
replicates.
[Flow diagram. States: parasite without DNA; parasite with unintegrated linear DNA; parasite with recircularised episomal DNA; parasite with integration; dead parasite. Transitions: transfection, circularisation, integration, selection, unstable segregation, replication.]
Figure 65: The possible paths parasites can take after transfection are many and
various
While use of linear DNA is sometimes thought of as simplifying
the process of transfection there are still many possible avenues
for transfectant parasites.
4.9 discussion
After efforts to optimise the culture, transfection and selection of P.
knowlesi, it is clear from these results that linear pJAZZ vectors can
be integrated into the P. knowlesi genome, creating desired targeting
events. On the other hand, it is evident that an alternative event is also
possible whereby vector DNA is carried without correct integration
into the genome.
I have not established what the ratio between these events is, nor
the order in which they happen. One possible explanation for P. knowlesi’s
ability to be transfected with linear DNA is that it may recircularise
this linear DNA rapidly, allowing it to carry and replicate this until
a spontaneous DSB in the target gene creates the right conditions for
integration.
A summary of the possible dynamics that can occur during the
process of transfection and selection is shown in Fig. 65, but the
proportion of parasites making each transition is unclear. Unstable
segregation of episomes imposes a significant fitness cost on the
parasite [147]. One model that fits the data is therefore that in each
transfection both episomes and integrants are formed and these then
compete to take over the population. Where integrants have wild-type
fitness (as with the HA-tags and with genes redundant in blood stages),
they take over the population. But if they have a significant fitness
cost, this may outweigh the weaknesses in episomal segregation,
resulting in episomes taking over the population. Presumably here
integrants will still be formed continually at a low level but they
will never take over the population.
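The competition model proposed above can be made concrete with a toy calculation. The sketch below is purely illustrative: every parameter value (starting numbers, growth per cycle, segregation loss and fitness cost) is a hypothetical placeholder rather than a measurement, and continued formation of new integrants is ignored.

```python
def integrant_fraction(cycles, integrants=1.0, episomal=1000.0,
                       growth=3.0, fitness_cost=0.0, segregation_loss=0.3):
    """Fraction of the surviving population carrying an integration.

    Each cycle integrants multiply by growth * (1 - fitness_cost), while
    episome carriers multiply by growth * (1 - segregation_loss), because
    daughters that lose the episome are killed by drug selection.
    """
    for _ in range(cycles):
        integrants *= growth * (1.0 - fitness_cost)
        episomal *= growth * (1.0 - segregation_loss)
    return integrants / (integrants + episomal)

# Integrants with wild-type fitness eventually dominate, even from a rare start
print(integrant_fraction(40))
# Integrants with a fitness cost larger than the segregation loss never do
print(integrant_fraction(40, fitness_cost=0.5))
```

In this toy model the outcome depends only on whether the fitness cost of the integration exceeds the loss caused by unstable segregation, which is the qualitative behaviour described in the text.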
The ability to efficiently epitope tag proteins of interest, combined
with P. knowlesi’s large merozoite size, may make it a powerful system
for the localisation of invasion-related proteins.
In short the ability to transfect with linear DNA is not, in P. knowlesi,
a silver bullet which ensures that only parasites bearing integrations
will come up, but genetic modifications with pJAZZ vectors are cer-
tainly possible in this parasite.
4.9.1 Prospects for barseq
Where does this leave the idea of barcode sequencing in P. knowlesi?
One important discovery is that barcodes can be carried by P. knowlesi
parasites without the gene they target being knocked out. Clearly this
means that at present barseq will be somewhat more cumbersome in
this species: transfections would have to be carried out for each vector
individually, assessed for successful modification by qPCR, and
possibly dilution cloned before being pooled for a barseq experiment.
However none of these challenges are insurmountable in an in vitro
system and such an approach would allow complete confidence in a
gene called as "redundant", avoiding any false positives due to locus
duplication.
On the other hand, it is possible that because of the formation of
episomes we will be unable to obtain populations of parasites with
significant fitness defects, these having been outcompeted by their
episomal counterparts. Without mutants with a range of fitness there
would be no differences for barseq to detect.
One key question will be whether technological changes allow the
rate of false-positive formation to be reduced.
4.9.2 Possible optimisations to increase proportion of parasites with a tar-
geted genome
4.9.2.1 Complete digestion of pJAZZ arms
In both the episomes I sequenced the pJAZZ long arm was present.
This is only circumstantial evidence that the presence of the arms
prevents integration or promotes recircularisation. It is possible that
it just so happens that a majority of the DNA transfected was undi-
gested and that recircularisation would have occurred regardless. How-
ever it is conceivable that repetitive sequences within the pJAZZ telom-
eres might promote circularisation, or that chew back by exonucle-
ases is important to allow integration and that the pJAZZ arms pre-
vent this. Using smaller amounts of DNA might allow digestion of a
greater proportion of vectors and promote integration.
4.9.2.2 Reducing initial population of parasites
I showed in my second round of transfections that dropping the DNA
concentration roughly tenfold resulted in fewer false-positive living
parasites. Ironically this may mean that all my efforts to optimise
transfection served to increase the rate of false-negatives. Provided that the
rate of integration of linear DNA is higher than the rate at which it
recircularises, it might be possible to obtain clean populations of
modified parasites by tweaking the transfection efficiency such that zero
recircularisation events occur. However achieving this balance for
every vector in every transfection is likely to be difficult.
4.9.2.3 Negative selection against episomes
In the episomes I sequenced, the pJAZZ long arm was present. Had
this backbone contained a negative selectable marker I would have
been able to kill parasites carrying the episome and thus select only
integrants. Negative selectable markers can undergo a variety of mu-
tations to become non-functional, however, which might undermine
such a strategy.
4.9.2.4 CRISPR/Cas9
Finally CRISPR/Cas9, which has only been brought to Plasmodium
very recently, might be able to increase the rate at which exogenous
DNA, whether linear or circular, is incorporated into the P. knowlesi
genome. This might allow increased transfection efficiency without a
concomitant increase in mistargeting events.
My preliminary data suggest that inclusion of Cas9 increases the
efficiency of correct targeting, at least at the one locus tested.
4.9.3 Conclusion
As has been previously observed, we found that human-adapted P.
knowlesi offers a high transfection efficiency and can be transfected
with linear DNA. This has enabled the use of this system to bring
PlasmoGEM vectors to an in vitro system for the first time.
The high efficiency of endogenous tagging by P. knowlesi vectors
suggests an immediate application for protein localisation. The large
size of merozoites in P. knowlesi might also allow increased through-
put of localisation experiments with better resolution of merozoite
substructure than those in other species.
The use of PlasmoGEM vectors in P. knowlesi enabled tagging of ev-
ery gene attempted, and a successful knock-out of a putative invasion-
related gene previously refractory to deletion in P. falciparum. How-
ever given the observation of parasites carrying barcodes even after
the targeting of essential genes, direct use of the P. berghei Plasmo-
GEM strategy of pooled transfection is not appropriate for P. knowlesi
and more development will be needed to establish a combined qPCR-
barseq assay which checks for targetability prior to pooling barcoded
vectors and tracking their growth through time.
More development is needed to fully realise the potential of Plas-
moGEM genetics in P. knowlesi, but the parasite is already a useful
model for many applications.
5
NEW APPROACHES AND TOOLS FOR LARGE
SCALE PHENOTYPIC ANALYSIS IN SILICO
5.1 barseq in p. berghei
Chapter 3 of this thesis described the largest screen to date of pu-
tative invasion-related genes, in Plasmodium berghei. The intervening
pages have described progress made towards extending this high-
throughput approach to an in vitro system in Plasmodium knowlesi.
But to fully leverage the large datasets that result from such analyses,
new approaches and tools are needed for analysis of the data pro-
duced at scale. This chapter will outline my approaches to some of
these challenges, using the P. berghei invadome dataset and curation
of the invasion literature.
5.2 network-based analysis of the invadome
I have already discussed the individual gene phenotypes for the barseq
dataset I derived in Chapter 3 but, since the ’core invadome’ was de-
rived from an interaction network based on gene expression patterns,
I also wanted to investigate whether combining this interaction data
with the mutant phenotype data would result in a more detailed pic-
ture of the genetic architecture underlying invasion. This is especially
important given how many of these genes have no annotated func-
tion.
5.2.1 Reconstruction of PlasmoINT network
The first prerequisite for such an analysis was to rebuild the
PlasmoINT network in a form in which I could analyse it. I queried
the PlasmoINT database [92] for the connections and interaction
coefficients for every P. falciparum gene and stored this dataset in a
tabular form, listing the two gene nodes and the likelihood score of
there being some interaction between them according to the database.
[Figure 66 panels:
(a) Subsetting of 90% confidence network (blue) from overall network (grey). Histogram: x-axis, Log10(Likelihood Score); y-axis, number of edges.
(b) Connectivity of 90% confidence network. Histogram: x-axis, connections per gene; y-axis, frequency.]
Figure 66: Selecting the 90% confidence subnetwork results in a network with rel-
atively low connectivity for most genes.
These data comprise the ‘edge-list’ for a network with 410,898 con-
nections (edges) linking a total of 4,655 genes (nodes) each with a like-
lihood score reflecting the degree of confidence that there is some link
between these proteins. In order to increase stringency I also isolated
the subset of these edges with likelihood-scores sufficient to qualify
for the 90% confidence network described in Hu et al. (Fig. 66a). This
reduced the number of edges to 145,622 and the number of genes in-
cluded to 3,466. Because the ’invadome’ I analysed was based on this
subset of the network, however, all 418 genes in it are still included.
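Subsetting the edge list in this way is a simple filter on the likelihood score. The sketch below shows the operation on a toy edge list; the gene names, scores and cut-off value are hypothetical, since the threshold corresponding to the 90% confidence network is not stated here.

```python
def confidence_subnetwork(edges, min_score):
    """Keep edges scoring at least min_score; return (edges, genes they connect)."""
    kept = [(a, b, score) for a, b, score in edges if score >= min_score]
    genes = {gene for a, b, _ in kept for gene in (a, b)}
    return kept, genes

# Hypothetical edge list: (gene, gene, likelihood score)
toy_edges = [
    ("geneA", "geneB", 250.0),
    ("geneA", "geneC", 3.5),
    ("geneC", "geneD", 40.0),
]
kept, genes = confidence_subnetwork(toy_edges, min_score=14.0)
print(len(kept), sorted(genes))
```

Genes whose only connections fall below the cut-off drop out of the network entirely, which is why the number of genes falls along with the number of edges.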
In order to verify that I had recapitulated the network in a way
that was true to Hu et al., I examined the entire 90% confidence
network with the Gephi graph-visualisation package. The network
was initialised with nodes placed in random positions and then the
Fruchterman-Reingold algorithm was used to iteratively reposition nodes
until the energy of the graph was minimised.
This approach is a form of force-directed graph drawing. Nodes are
initially randomly positioned in a compact space on a two-dimensional
plane. It is assumed that each node repels all other nodes with a cer-
tain force, but interactions between nodes are represented as ’springs’
which pull these nodes together. The position of nodes is then
iteratively simulated and they expand out from the compact space,
eventually settling in a pattern which best satisfies (at least as a
local optimum) the conflicting demands of the attracting and repelling
forces.
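The procedure described above can be illustrated with a small, simplified implementation. This is a sketch of the general Fruchterman-Reingold scheme and not the Gephi implementation; the parameter names and default values are my own assumptions, and refinements such as confining nodes to a frame are omitted.

```python
import math
import random

def fruchterman_reingold(nodes, edges, iterations=200, size=1.0, seed=0):
    """Return {node: (x, y)} from a basic Fruchterman-Reingold layout."""
    nodes = list(nodes)
    rng = random.Random(seed)
    # Nodes start at random positions within a compact region.
    pos = {n: [rng.uniform(-0.05, 0.05), rng.uniform(-0.05, 0.05)] for n in nodes}
    k = size / math.sqrt(len(nodes))      # preferred distance between nodes
    temperature = size / 10.0             # maximum step length per iteration
    cooling = temperature / (iterations + 1)
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels with magnitude k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        # Each edge acts as a spring pulling with magnitude distance^2 / k.
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force
        # Move each node along its net force, capped by the temperature.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy)
            if length > 0:
                step = min(length, temperature)
                pos[n][0] += dx / length * step
                pos[n][1] += dy / length * step
        temperature -= cooling
    return {n: (p[0], p[1]) for n, p in pos.items()}

# Hypothetical four-node graph made of two connected pairs
layout = fruchterman_reingold(["a", "b", "c", "d"], [("a", "b"), ("c", "d")])
print(layout)
```

The gradually falling temperature is what lets the layout settle, and it is also why the final arrangement depends on the random starting positions.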
[Network diagram; legend: Invadome, Non-invadome.]
Figure 67: Invasion sub-network in context of entire 90% confidence network
Fruchterman-Reingold transformed 90% confidence network
from Hu et al. with the invadome highlighted in blue. The cluster-
ing of the invadome illustrates the network has been successfully
reconstructed.
When this graph was visualised the invadome did indeed cluster
well (Fig. 67), which suggested the network had been successfully re-
built. However these visualisation techniques project hundreds, or
thousands, of dimensions onto a two-dimensional plane and thus
there are inevitable compromises made in placement. This explains
the scattering of non-invadome nodes within the invadome, and should
be borne in mind throughout this chapter.
5.2.2 Overlaying localisation data on the PlasmoINT network
Now confident that the network had been reproduced, I next wanted
to look within the network of the invadome module itself. Hu et al.
conducted a series of analyses to show that the genes in this set were
enriched for invasion-related localisations, but I wanted to investigate
whether there was sub-structure within the data. I therefore selected
only the subnetwork containing genes which were part of the 418
gene invadome and repeated the Fruchterman-Reingold layout algo-
rithm. In order to visualise potential substructure I downloaded local-
isation data from my PhenoPlasm database (which will be described
in detail later) and encoded the localisation of the network genes by
colour.
Figure 68: Separation by localisation in the invadome subnetwork
In this figure the 418 genes of the Hu et al. invadome are laid out
using the 90% confidence PlasmoINT network. They have been
coloured according to their localisation.
This analysis, an example of which is shown in Fig. 68, seems to
show clustering by localisation in that the secretory organelles, the
rhoptries and the micronemes, are relatively well separated. However
there is intermingling with other invasive localisations such as the
inner-membrane complex.
One interesting node here is PF3D7_1143200, the DnaJ protein
whose orthologue I knocked out in P. knowlesi in Chapter 4, which
in P. falciparum had contradictory localisation signals implicating it
as being either in the rhoptries or the micronemes [89]. This
gene is positioned in the network in a fashion highly suggestive of
rhoptry localisation. Furthermore, when we examine the individual
connections that this protein makes, 8 are to rhoptry proteins and
0 to micronemal proteins. These observations are an example of the
power that pooling these data sources can provide.
5.3 large scale analysis of mutant phenotypes
Now with access to both a large set of phenotyping data, and some
of the network dynamics that may underlie it, I investigated three
broad hypotheses about the structure of the phenotype dataset: the
centrality-lethality rule, the correlation of phenotypes for co-expressed
genes, and the relationship between the
phenotypes of regulator knock-outs and those of their targets.
5.3.0.1 Centrality-lethality
The centrality-lethality rule [98] states that the more interactions a
protein makes with other proteins, the more likely it is to be essential.
It was proposed on the basis of analysis of S. cerevisiae interaction
networks and has since been found to apply also to E. coli and H.
sapiens [106].
Possessing a large dataset of both phenotypes and putative inter-
actions I decided to investigate whether I could detect evidence for
centrality-lethality in Plasmodium. The phenotype data is from P. berghei
and the interaction data from P. falciparum but as I am investigating
core conserved invasion genes I expected each datum to be represen-
tative of both species. I compared the connectivity (i.e. the number
of connected nodes) in the 90% confidence PlasmoINT data of es-
sential and non-essential genes as defined by the invadome barseq
data (Fig. 69). There is no significant difference, on average, between
the connectivity of essential and non-essential genes. However when
one looks at the 6 most-connected genes (circled in the figure) it is
striking that every single one is essential, and this is unlikely to be a
coincidence.
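The comparison described above rests on counting, for each gene, the number of distinct genes it is connected to, and then inspecting those counts by phenotype. A minimal sketch on made-up data follows; the gene names, phenotype labels and function names are hypothetical, and no particular significance test is implied.

```python
from collections import defaultdict
from statistics import median

def connectivity(edges):
    """Number of distinct neighbours for each gene in an edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(ns) for gene, ns in neighbours.items()}

def top_connected(degree, phenotype, n):
    """The n most connected phenotyped genes as (gene, degree, phenotype)."""
    ranked = sorted((g for g in degree if g in phenotype),
                    key=lambda g: degree[g], reverse=True)
    return [(g, degree[g], phenotype[g]) for g in ranked[:n]]

# Hypothetical toy network and phenotypes
toy_edges = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"), ("g1", "g2")]
toy_phenotype = {"hub": "essential", "g1": "non-essential",
                 "g2": "essential", "g3": "non-essential"}
degree = connectivity(toy_edges)
for label in ("essential", "non-essential"):
    values = [degree[g] for g in degree if toy_phenotype[g] == label]
    print(label, median(values))
print(top_connected(degree, toy_phenotype, 1))
```

Looking at the top of the ranking separately from the group averages mirrors the observation made above about the most connected genes.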
[Plot: x-axis, mutant phenotype (Essential, Non-essential); y-axis, connectivity (0 to 400).]
Figure 69: Testing centrality-lethality in Plasmodium invasion genes
While there is no significant difference in the mean connectivity
of essential and non-essential genes it is notable that the 6 most
connected genes are all essential.
This analysis does not provide strong statistical evidence that the
centrality-lethality rule applies across the board in Plasmodium, but it
has been noted previously that high-quality protein-protein interaction
data are needed to detect the relatively weak effect of
centrality-lethality; no such data currently exist for Plasmodium. The
hints of centrality-lethality in our data, as well as the widespread
nature of the phenomenon in other organisms, suggest that the lack of
evidence in this dataset may be a matter of insufficient power rather
than a feature of Plasmodium biology.
5.3.1 Correlated mutant phenotypes for ’interacting’ genes
In Fig. 68 we saw that the invadome subnetwork is structured, in
part, according to proteins’ subcellular localisation. Now that I had
brought the knock-out phenotypes of genes into the model I sought
to investigate whether there might also be structure on the basis of
the PlasmoINT network.
There are a priori reasons to believe that such structure might ex-
ist. We know that in P. falciparum there are redundant and essential
pathways within the invasion process. If modules like these appear,
in some form, in the network then we might see structure.
For this analysis, I selected a further sub-network from the Plas-
moINT network, selecting only those genes in the 320 gene core in-
vadome. I then laid out this network with the ForceAtlas algorithm
which adjusts results for the connectivity of a node (though sim-
ilar results are obtained with Fruchterman-Reingold) and this time
encoded mutant phenotype by colour.
The result (Fig. 70a and Fig. 71) did appear to have structure, with
essential genes seemingly enriched at one ’end’ and redundant genes
at the other. Since humans are apt to spot apparent patterns in random
data (’apophenia’) [66], I wanted some measure of significance,
and so placed a line of division where phenotypes appeared to be best
separated. This yielded the split in phenotypes shown in Fig. 70b. The
enrichment for essential genes in the upper portion is significant by
the hypergeometric distribution with an initial p-value of 6.2 × 10⁻⁵.
However since the data were partitioned at the most significant point,
correcting for multiple comparisons is necessary. Methodology for
this situation is unclear but, given the low p-value, a conservative
Bonferroni correction would permit me to draw 800 such lines and test
each by the hypergeometric distribution to retain experiment-wide
[Legend, mutant phenotype: Redundant, Slow, Essential, Unknown.]
(a) Core invadome subnetwork with barseq mutant phenotypes indicated by colour.
[Panel labels: above ‘the line’; below ‘the line’.]
(b) All phenotyped genes above and below the line marked in panel (a), laid out side
by side to enable visualisation of the magnitude of the difference in ratios.
Figure 70: Analysing the invadome subnetwork according to mutant phenotype
158 new approaches and tools for large scale phenotypic analysis IN SILIC O
Figure 71: The core invadome coloured by P. berghei barseq mutant phenotypes,
and labelled
5.3 large scale analysis of mutant phenotypes 159
Table 10: Contingency table showing network edges grouped by mutant phenotypes
at either node
target
Essential Slow Redundant
source
Essential 1310 269 859
Slow 269 48 165
Redundant 859 165 722
significance at p ă 0.05. Therefore we can conclude that the non-
random distribution is indeed significant.
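
The Bonferroni bound of roughly 800 lines can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the only inputs are the two figures quoted in the text (the initial p-value and the experiment-wide threshold of 0.05).

```python
import math

# Figures taken from the text; the variable names are illustrative only.
alpha = 0.05          # experiment-wide significance threshold
p_initial = 6.2e-5    # hypergeometric p-value at the chosen dividing line

# Bonferroni: with m tests, each individual test must satisfy p <= alpha / m.
# The largest m for which the observed p-value still passes is therefore:
max_tests = math.floor(alpha / p_initial)

# Check that a round figure of 800 candidate lines stays within that bound.
passes_with_800 = alpha / 800 >= p_initial

print(max_tests, passes_with_800)
```

The bound comes out slightly above 800, so the round figure used in the text is on the conservative side.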
However, as I have noted previously, these visualisations are imprecise
representations of complex network data in two dimensions. I
next sought to pull apart the network into its individual nodes and
connections, and observe whether phenotypes here were correlated
based on interaction. My initial approach to this was to select all
edges within the 90% confidence core invadome which connected
any two genes with barseq phenotypes. I then examined these edges
and binned them into categories based on the source node: redun-
dant, essential and slow growth. For each edge in each of these groups
I then examined the other, ‘target’, node. I counted the number of
redundant, essential and slow-growth targets for each of the source
categories, producing the data shown in Table 10.
A $\chi^2$ test performed on this contingency table gives $p = 0.00063$.
This demonstrates that the mutant phenotype of the node at one side
of an edge is not independent of the phenotype at the other. The di-
rection of the differences, and their magnitude and significance can
be seen in Fig. 72. There is an assortative trend, as was seen on the net-
work diagram, with redundant nodes tending to be more connected
to redundant nodes and essential nodes tending to be more connected
to essential nodes. (Attenuated-mutant nodes are more likely to be
connected to essential nodes.)
While this lack of independence is highly significant, the effect size
is small. Under the null hypothesis 653 redundant–redundant edges
would be expected but we observe 722. This is a discrepancy of only
10%.
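
The test on Table 10 can be re-derived from the published counts alone. The sketch below is not the code used for the thesis; it is a plain-Python recomputation that assumes a standard Pearson chi-squared test of independence on the 3 x 3 table, and uses the closed-form upper tail of the chi-squared distribution that holds for four degrees of freedom.

```python
import math

# Edge counts from Table 10 (rows: source phenotype, columns: target phenotype).
labels = ["Essential", "Slow", "Redundant"]
observed = [
    [1310, 269, 859],
    [269, 48, 165],
    [859, 165, 722],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic and per-cell Pearson residuals.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
residuals = [
    [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# A 3x3 table has (3-1)*(3-1) = 4 degrees of freedom. For 4 degrees of
# freedom the chi-squared upper tail is exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"expected redundant-redundant edges: {expected[2][2]:.0f}")
print(f"chi-squared = {chi2:.2f}, p = {p_value:.5f}")
print(f"redundant-redundant residual: {residuals[2][2]:.1f}")
```

Run as written, this should give an expected redundant-redundant count of about 653 and a p-value of about 0.0006, in line with the values quoted in the text.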

Figure 72: Association plot demonstrates that connected nodes tend to have the
same mutant phenotype
This is a Cohen-Friendly association plot demonstrating a lack
of independence between the phenotypes on either end of a PlasmoINT
connection. The axes are source node and target node phenotype
(Essential, Slow, Redundant), and the scale shows Pearson residuals
from −1.8 to 2.7. The width of each cell represents the expected
size of the cell and the height the signed difference between
observed and expected. Edges with redundant genes at one end are
more likely than would be expected by chance to have a redundant
gene at the other end also.

However, this analysis considered each edge singly. I next considered
each node on the basis of all nodes connected to it. For each
node I counted the number of connected essential nodes ($e$) and the
number of connected redundant nodes ($r$). I then computed a simple
metric, which I will call the essentiality ratio ($R$):

$$R = \frac{e}{e + r}$$

I calculated $R$ for all genes which were themselves essential or
redundant, and excluded genes where $e + r < 3$ since for these the
metric is likely to be very noisy. The mean $R$ for essential and
redundant genes was 0.63 and 0.53 respectively, and these were
significantly different ($p = 1.871 \times 10^{-5}$). This increased
significance suggests that the essentiality-ratio approach adds more power.
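
As a minimal sketch of how the essentiality ratio could be computed from an edge list, the function below counts essential and redundant neighbours for every node and applies the $e + r \geq 3$ filter. The function name, the data layout and the toy network are all assumptions made for illustration; none of them comes from the thesis code.

```python
from collections import defaultdict

def essentiality_ratios(edges, phenotype, min_neighbours=3):
    """Return {gene: R} where R = e / (e + r) over phenotyped neighbours."""
    counts = defaultdict(lambda: {"essential": 0, "redundant": 0})
    for a, b in edges:
        # Each undirected edge contributes a neighbour to both of its ends.
        for node, neighbour in ((a, b), (b, a)):
            neighbour_phenotype = phenotype.get(neighbour)
            if neighbour_phenotype in ("essential", "redundant"):
                counts[node][neighbour_phenotype] += 1

    ratios = {}
    for gene, c in counts.items():
        e, r = c["essential"], c["redundant"]
        if e + r >= min_neighbours:  # skip genes where e + r < 3 (too noisy)
            ratios[gene] = e / (e + r)
    return ratios

# Toy, made-up network: gene names and phenotypes are placeholders.
phenotype = {"g1": "essential", "g2": "essential", "g3": "redundant",
             "g4": "essential", "g5": "redundant"}
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g5", "g3"), ("g2", "g4")]

ratios = essentiality_ratios(edges, phenotype)
print(ratios)
```

In the toy network only g1 has three phenotyped neighbours (two essential, one redundant), so it is the only gene that receives a ratio. Slow-growth neighbours, and neighbours without a phenotype, are ignored, matching the definition of $e$ and $r$ in the text.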
I compared the distribution of essentiality ratios for redundant and
essential genes (Fig. 73). The populations of redundant and essential
genes do appear to have offset essentiality ratios, but the differences
appear most profound at the extremes. In the barseq data any gene in
which more than 82% of phenotyped connected genes were essential
was itself essential (there were 7 such genes) and any gene where fewer
than 37% of connected genes were essential was redundant (there
were 3 such genes).
I was interested in whether phenotypes could be predicted for
invadome genes for which we do not yet have an established
phenotype. I therefore applied these limiting criteria (i.e. criteria
that would not give false-positive results in the existing data) to
the set of genes not yet phenotyped by barseq. This
allowed the prediction of mutant phenotypes for 23 further genes,
which exceeded these limits and had no known barseq phenotype
(Table 11).
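
The prediction rule amounts to two cut-offs on the essentiality ratio. The sketch below uses the 82% and 37% limits quoted in the text; the function name and the example genes are hypothetical.

```python
def predict_phenotype(ratio, redundant_below=0.37, essential_above=0.82):
    """Map an essentiality ratio to a predicted phenotype, or None."""
    if ratio > essential_above:
        return "Essential"
    if ratio < redundant_below:
        return "Redundant"
    return None  # between the two limits: no prediction is made

# Made-up ratios for hypothetical unphenotyped genes.
example_ratios = {"geneA": 0.90, "geneB": 0.55, "geneC": 0.20}
predictions = {gene: predict_phenotype(r) for gene, r in example_ratios.items()}
print(predictions)
```

Genes whose ratio falls between the two limits are deliberately left unpredicted, since that is the region where essential and redundant genes overlap in the existing data.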

Figure 73: Essentiality ratios for invadome genes clustered by barseq mutant
phenotype.
(Horizontal axis: essentiality ratio, from 0.00 to 1.00; groups: Unknown,
Essential, Redundant.)
Two lines are marked, one at the minimum essentiality ratio for
essential genes, the other at the maximum essentiality ratio for
redundant genes. These were used for predicting the possible
phenotypes of unknown genes.

This table also includes any suggested mutant phenotypes from
other previous experimental work, retrieved from the PhenoPlasm
database. Such data were available for six genes, and all of these were
predicted as essential. In three cases the genes had proven refractory to
deletion in previous experimental techniques, but in the remaining
cases there were reports of successful deletion.
This limited experimental data shows that this approach has a sig-
nificant number of false predictions. But I would hope, based on the
profound differences seen in the distribution, that a larger screen of
these predicted genes would demonstrate this approach to have at
least some predictive power. As more data arrives, from future barseq
experiments and more conventional approaches, these issues will be-
come clearer.
Regardless of whether it is useful for prediction, we have clearly
shown that in this subset of the invadome there is correlation of
phenotypes between putatively interacting genes.

Table 11: Mutant phenotypes for invasion genes predicted based on their
essentiality ratios.

Gene           Product                                               Prediction  Experimental
PBANKA_010990  cons. Plasmodium protein, u. f.                       Essential   -
PBANKA_052030  cons. Plasmodium protein, u. f.                       Redundant   -
PBANKA_060060  NIMA related kinase 3, putative (NEK3)                Essential   Essential
PBANKA_060500  rhodanese like protein, putative                      Redundant   -
PBANKA_070180  cons. Plasmodium protein, u. f.                       Essential   -
PBANKA_071440  cons. Plasmodium protein, u. f.                       Essential   -
PBANKA_080940  P1 nuclease, putative                                 Essential   -
PBANKA_081900  secreted acid phosphatase, putative,                  Essential   Essential
               glideosome-associated protein 50, putative (GAP50)
PBANKA_082810  fumarate hydratase, putative                          Essential   Essential
PBANKA_083270  adenylate kinase-like protein 2, putative (AKLP2)     Redundant   -
PBANKA_090580  DnaJ protein, putative                                Essential   Redundant
PBANKA_092700  tyrosine kinase-like protein, putative (TKL2)         Essential   Redundant
PBANKA_100660  cons. Plasmodium protein, u. f.                       Essential   -
PBANKA_123300  SWIB/MDM2 domain-containing protein, putative         Essential   -
PBANKA_124010  cons. Plasmodium protein, u. f.                       Redundant   -
PBANKA_131330  cons. Plasmodium protein, u. f.                       Redundant   -
PBANKA_132370  FYVE and coiled-coil domain-containing protein,       Redundant   -
               putative (FCP)
PBANKA_133170  zinc finger protein, putative                         Redundant   -
PBANKA_134070  mitochondrial fission 1 protein, putative (FIS1)      Essential   -
PBANKA_140640  tripartite motif protein, putative                    Essential   -
PBANKA_143900  cons. Plasmodium protein, u. f.                       Essential   -
PBANKA_144330  merozoite surface protein 9, putative                 Essential   Redundant
PBANKA_144950  cons. Plasmodium protein, u. f.                       Essential   -
PBANKA_146310  cons. Plasmodium protein, u. f.                       Redundant   -

5.3.2 Investigating the connection between a regulator’s phenotype and
those of its targets
Among the invasion network are four ApiAP2 genes, putative tran-
scription factors in Plasmodium. For one of these, AP2-O, in Chapter 3
I detected a viable knock-out in the blood stage. This viability has
already been well described but I found that the mutants appeared
to grow with reduced fitness in the blood stage. As I discussed there,
this is a controversial result given the primary role of this gene is in
the ookinete. I wanted to investigate these dynamics further at scale
and see whether a causative downstream gene could be identified.
In a simplistic model in which each AP2 gene acts alone to activate
downstream genes in an all-or-nothing fashion, it would logically fol-
low that it must be possible to knock out every gene targeted by AP2-
O to generate viable parasites, since the effect of an AP2-O deletion
would be to effectively knock-out each of these genes.
My first attempt to test this idea was to download the targets of
AP2-O, suggested recently on the basis of ChIPSeq [103]. When I
analysed those genes in this set for which I had barseq-phenotyping
data, there was a trend towards increased numbers of viable knock-
outs in the targets of AP2-O compared to the background but the
effect was not statistically significant.
However, the model given above is overly simplistic. In fact some
genes regulated by AP2-O might have their expression only subtly
altered by its binding, and what effect this binding has on transcription
may depend on other transcription factors and on the regulatory
elements of the gene in question; some of these genes might be activated
by AP2-O binding.
I therefore chose to consider data which recorded the actual ef-
fect of the knock out on transcription. I analysed ookinete microarray
data from [103]. This comprised microarrays from 7 AP2-O knockout
ookinete cultures and from 5 wild-type controls. I used the Gene Ex-
pression Omnibus to calculate the log fold change for each probe, and
then performed further analysis in R. Where multiple probes covered
the same gene, the median of their log fold-changes was taken. These
values were compared to the mutant fitness (in the blood stage) of the
knock-out of the individual gene by barseq. This allowed me to
examine mutant fitnesses in the context of the degree of knockdown of
their gene in the AP2-O knock-out (see top panel of Fig. 74).
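
The probe-to-gene step can be illustrated as follows. The analysis in the text was carried out in R; this sketch uses Python purely for illustration, and the function name, gene identifiers and values are all invented placeholders rather than data from [103] or from the barseq screen.

```python
from collections import defaultdict
from statistics import median

def gene_level_log_fc(probe_rows):
    """Collapse (gene_id, probe_log_fc) pairs to one median value per gene."""
    by_gene = defaultdict(list)
    for gene_id, log_fc in probe_rows:
        by_gene[gene_id].append(log_fc)
    return {gene_id: median(values) for gene_id, values in by_gene.items()}

# Made-up probe-level values; gene IDs are placeholders, not real results.
probes = [("geneA", -2.1), ("geneA", -1.7), ("geneA", -1.9),
          ("geneB", 0.2), ("geneB", 0.4)]
gene_fc = gene_level_log_fc(probes)

# Pair each gene's expression change with its (hypothetical) barseq phenotype.
barseq = {"geneA": "Redundant", "geneB": "Essential"}
for gene_id, log_fc in sorted(gene_fc.items()):
    print(gene_id, round(log_fc, 2), barseq.get(gene_id, "Unknown"))
```

Taking the median rather than the mean keeps a single aberrant probe from dominating the gene-level value, which matters when a gene is covered by only a handful of probes.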
In this analysis it stood out that none of the genes strongly affected
by AP2-O knock down were essential in the blood stage, which is the
result one would predict from the AP2-O mutant’s viable phenotype.
I decided to extend this analysis to include targetability data available
from RMGMdb (see middle panel), which makes the same case.

Figure 74: Degree of knock-down in AP2-O knockout (as published in [103])
compared to the same gene’s individual mutant phenotype
(Plot labels recovered from the figure: Unknown; PKB.)
All severely knocked-down genes are non-essential, consistent
with the fact that the AP2-O mutant is viable in the blood stage.
Protein kinase B is an essential gene which shows two-fold
down-regulation in an AP2-O knockout and so may have a role in
explaining the apparent slow growth of these parasites.

This argument may seem self-fulfilling: one could argue that we
already knew these genes were non-essential when we saw their tran-
script level reduced in the knock-out parasites. But this analysis re-
mains useful. The microarray data on the knock-out is from ookinetes,
not from the blood stages. This analysis could have proved that AP2-
O must act differently in the blood stage, compared to the ookinete,
if it was found that there are genes essential in the blood stage which
are highly activated by AP2-O in the ookinete. Since this was not
found, the idea that the targets of AP2-O in the blood stage are the
same as those in the ookinete remains a possibility (albeit unproven).
This analysis was also useful for investigating the reason for the
apparent attenuation in the AP2-O knockout. Had I identified
a single gene which had significantly reduced expression in the
AP2-O knockout and gave slow growth in the blood stage when mutated,
this gene alone could have explained the apparent slow growth of
AP2-O knock-out parasites.
We did not observe such a gene. It could yet exist, in the significant
portion of genes we have yet to probe by barseq. But alternatively it
is possible that though individual knock-outs of genes whose expres-
sion is strongly governed by AP2-O have no effect in the blood stage,
the loss of all these genes at once has a measurable effect, either
through epistatic effects or just the additive influence of many very
small growth defects. A final possibility is that the small effect of AP2-
O knock-out on expression of essential genes may be the root of the
effect. Protein kinase B, an essential protein known to be involved in
invasion, is two-fold down regulated in the AP2-O knock out and so
could represent the causative gene.
I think that this approach of bringing together large scale regula-
tory and fitness datasets will add to our understanding of both com-
ponents. For example we might be able to infer the likely blood-stage
non-essentiality for the other genes heavily downregulated in the
AP2-O knock-out for which we have no phenotyping data at present
(see bottom panel of Fig. 74).
5.4 building the infrastructure to collate reverse genetic data across species
In the introduction I cited an estimate from [48] that approximately
500 genes have been successfully targeted in malaria parasites. This
is an estimate because there is no database that compiles attempts to
knock out Plasmodium genes, other than the excellent Rodent Malaria
genetically modified Parasites database (RMgmDB) [105] which applies
only to the rodent parasites. The result is that when researchers be-
gin to examine a large set of genes, as I wanted to do for the in-
vadome, they must trawl through the literature to uncover attempts
to disrupt these genes. If a minute is spent researching each gene
of the invadome this adds up to 7 hours of intensive research, and
even then only one species has been investigated. There was a clear
need for a database that allowed an instant view of the transfection
attempts carried out for a large set of genes, in any of the Plasmodium
genetic systems: P. falciparum, P. berghei, P. chabaudi, P. yoelii and P.
knowlesi.

Figure 75: PhenoPlasm database schema
The database is designed to allow painless migration to new gene
models in subsequent genome releases

Additionally, since the ApiLoc database ceased to be curated, there
is no up-to-date database of localisations for Plasmodium genes.
Therefore, before embarking on a comprehensive search of the in-
vasion literature for previously described mutant phenotypes, I built
a database in which this information could be deposited in a
machine-readable form, so that my data can be accessed by others and they,
in turn, can contribute such data for other genes if they wish.
5.4.1 Technology
I used MySQL to construct an online database to hold both phenotyp-
ing and localisation data, which I call PhenoPlasm (http://phenoplasm.org).
An empty database of genes and their orthological relationships was
set up on the basis of data from PlasmoDB. The structure of the tables
is as shown in Fig. 75.
I designed the system to be backwards compatible with the for-
malised description of phenotypes established by RMGMdb. This
meant that I could initially populate it with the data already avail-
able in that database.
Figure 76: The PhenoPlasm database allows queries of all experimental genetic data
on a gene to be assessed quickly
This picture depicts a search for some members of the invadome.
This is a search for P. falciparum genes and so opaque icons rep-
resent P. falciparum phenotypes. Semi-transparent icons represent
the phenotypes of 1:1 orthologues in another species, in these
cases P. berghei. In this case we see that a DnaJ protein was refrac-
tory to deletion in P. falciparum but targetable in P. berghei with no
detectable phenotype in the blood stage.
A PHP/XHTML front-end was then built in the Bootstrap frame-
work to present this data in an easily comprehensible way. Screen-
shots of the user interface, which has been designed to show only
specific types of data in a minimalist format, are shown in Figs. 76–78.
5.4.2 Key features
There are a number of key features that make this database different
to previous endeavours.
orthology The database is constructed such that searches for genes return
both phenotypes in the species requested, and phenotypes in
1:1 orthologous genes from any other species. It is the only tool
at present which allows a potentially comprehensive view of
reverse genetic data for a set of genes across Plasmodium. This
allows even researchers of the genetically intractable P. vivax
to enter gene IDs from this species and receive the results of
experiments in P. falciparum, P. berghei and P. knowlesi.
Figure 77: Single-gene view on PhenoPlasm
This allows the source of a claim to be checked for citations and
verification.
Figure 78: Interface for data addition to PhenoPlasm
speed In the world of -omics there are many occasions where pheno-
types for a large set of genes will need to be queried. RMGMdb
becomes unwieldy when more than ~500 genes are queried, tak-
ing minutes to load and sometimes timing out. The database
structure of PhenoPlasm is organised such that even phenotyp-
ing data for the ~5,000 genes in an entire genome takes less than
1.5 seconds to return.
crowd-sourcing To facilitate the capture of large amounts of data the site allows
any user to instantaneously add phenotyping data for a mutant.
The wisdom or otherwise of such an approach will only become
apparent with time, but I hope the ease with which mutants can
be added will help the database to keep up to date.
referencing Like an encyclopedia, the database does not aim to be a pri-
mary datasource but a collation of references to others. This
means that data such as blot images is not currently included,
as it is on RMGMdb. It is expected that this data will be pre-
sented in a publication and referenced there. This is a necessary
consequence of a non-curatorial approach. Every phenotype is
referenced with a PubMed ID or other hyperlink where this is
unavailable (e.g. conference abstracts).
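
To make the orthology and speed features concrete, the sketch below shows one way a single batched query could return phenotypes both for the requested genes and for their orthologues. It is an illustration only: it uses SQLite rather than MySQL so that it runs without a server, and the table names, column names, gene identifiers and references are all hypothetical and are not the PhenoPlasm schema shown in Fig. 75.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genes (gene_id TEXT PRIMARY KEY, species TEXT, ortho_group TEXT);
CREATE TABLE phenotypes (gene_id TEXT, phenotype TEXT, reference TEXT);
""")
# Placeholder rows: identifiers, phenotypes and references are invented.
conn.executemany("INSERT INTO genes VALUES (?, ?, ?)", [
    ("Pf_gene1", "P. falciparum", "OG1"),
    ("Pb_gene1", "P. berghei", "OG1"),
    ("Pf_gene2", "P. falciparum", "OG2"),
])
conn.executemany("INSERT INTO phenotypes VALUES (?, ?, ?)", [
    ("Pf_gene1", "Refractory", "ref-1"),
    ("Pb_gene1", "Viable", "ref-2"),
])

def phenotypes_with_orthologues(conn, gene_ids):
    """Phenotypes for the queried genes and for genes sharing their group."""
    placeholders = ",".join("?" for _ in gene_ids)
    sql = f"""
        SELECT q.gene_id, g.gene_id, g.species, p.phenotype, p.reference
        FROM genes AS q
        JOIN genes AS g ON g.ortho_group = q.ortho_group
        JOIN phenotypes AS p ON p.gene_id = g.gene_id
        WHERE q.gene_id IN ({placeholders})
        ORDER BY q.gene_id, g.gene_id
    """
    return conn.execute(sql, list(gene_ids)).fetchall()

rows = phenotypes_with_orthologues(conn, ["Pf_gene1", "Pf_gene2"])
for row in rows:
    print(row)
```

Fetching all requested genes in one statement, rather than issuing one query per gene, is the kind of design that keeps response times low for genome-scale gene lists. Every returned row carries its reference, in keeping with the referencing principle described above.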
5.4.3 A systematic review of experimental genetics in Plasmodium inva-
sion
The absence of semantic digital identifiers for genes referred to in
papers makes finding every reference to attempts to mutate a gene
potentially challenging. I constructed a feature within PhenoPlasm to
search Google Scholar (which contains full-text from many journals)
for both the gene ID of a particular gene, and also every previous
iteration of that name (PFDXXXXC, MAL13P1.XXX, etc.)
Using this feature I searched for references to each of the 418 genes
listed in the Hu et al. invadome. Where a gene also had a common
name I additionally searched for this name to include publications
not providing a full gene ID.
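
A search of this kind needs every known name for a gene combined into a single query string. The sketch below is one plausible way to assemble such a string; the function name is hypothetical, the identifiers are placeholders in the style of the naming schemes mentioned above, and the exact query format used by the PhenoPlasm feature is not described in the text.

```python
def build_search_query(current_id, previous_ids=(), common_name=None):
    """Join all known names for a gene into one quoted OR query string."""
    terms = [current_id, *previous_ids]
    if common_name:
        terms.append(common_name)
    # De-duplicate while keeping order, then quote each term.
    unique_terms = list(dict.fromkeys(terms))
    return " OR ".join(f'"{term}"' for term in unique_terms)

# Placeholder identifiers; "EXAMPLE1" stands in for a gene's common name.
query = build_search_query(
    "PF3D7_XXXXXXX", ["PFDXXXXC", "MAL13P1.XXX"], "EXAMPLE1"
)
print(query)
```

Quoting each identifier keeps punctuation such as the full stop in older-style IDs from being split into separate search terms.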
Of the 418 genes in the invasion-network list, reported knock-out
attempts were found for 74. Exactly 50% of these were successful,
with the remainder refractory to deletion. There was no data in P.
falciparum for the remaining 82% of genes, but some of these had
data available from orthologues imported from RMGMdb.
I also examined the immunofluorescence and immuno-EM images
available at Malaria Metabolic Pathways to find localisations for this
list of proteins and where I could see a clear organellar localisation I
added this information to the database.
5.4.4 Database statistics
I have since supplemented the database with additional knock-out
attempts reported in the literature and at scientific conferences. As of
September 2015, PhenoPlasm contains records of 164 genes for which
attempts to disrupt have been reported in P. falciparum, all manually
curated from the literature into a systematic machine-readable form.
(78 of these were successfully disrupted, 86 are described as refractory
to deletion). I believe that this includes the vast majority of genes tar-
geted to date. There are an additional 457 genes for which phenotypes
have been reported when data from P. berghei (provided by RMGMdb)
and P. knowlesi is included.
5.5 applying phenoplasm data to invasion
The PhenoPlasm database has applications to every area of parasite
genetics, but we can illustrate some of these with the example of
invasion. The core invadome, already presented with barcode sequencing
data, can be supplemented by PhenoPlasm data generated from
previous single transfection experiments, as seen in Fig. 79. This is useful
to fill in parts of the network which are not covered by pJazz vectors
at present. A caveat is that where such data comes from P. falciparum
we can be less confident that a gene refractory to deletion is in fact
essential, given the greater difficulty in achieving genetic manipula-
tions in this organism.

Figure 79: Supplementing invadome barseq data with PhenoPlasm database
phenotypes
(Legend. Source of phenotype: barseq (Pb), or curated literature on PhenoPlasm
(includes Pf and Pb). Mutant phenotype: Redundant, Attenuated or Essential.)

One approach to this challenge is to only include data points from
P. falciparum that indicate a successful knock-out has occurred. I used
this approach to merge barseq data with the phenotypes gathered
from the literature in PhenoPlasm and repeated some of the analyses
performed earlier in the chapter to see if this additional data could
increase the power of the analysis. I found the same direction of
correlation in linked nodes (Fig. S3) but with this dataset the overall effect
was not statistically significant by $\chi^2$ test. However, an analysis of
the difference in essentiality ratio between essential and redundant
genes gave a result of $p = 0.04$, again illustrating the increased power
of this analysis method. Nevertheless the effect was noticeably less
strong compared to the dataset with barseq data alone.
While this could simply represent regression to the mean, i.e.
indicating that the effects we saw in barseq data were pure fluctuations,
the p-value of $1.871 \times 10^{-5}$ seen in the barseq data suggests to
me that this is not the case. Rather, I think the reduced power in the
subsequent analysis illustrates the difficulties in pooling data from
disparate laboratories: many factors, such as the selection of these
genes for study and publication bias in favour of successful mutants,
may be playing a role in confounding results. These factors make
hypothesis-free screening approaches particularly important. As such
efforts continue on invasion genes it will be important to revisit these
analyses and confirm how well the effects I observed are reproduced.
5.5.1 Non-conserved invasion ligands (and others)
In this thesis I have primarily focused on the core invadome because
it can be investigated using P. berghei and P. knowlesi. Of course the
genes within this core set must interact with rapidly evolving parasite
proteins at the interface with the host in order for invasion to occur. A
complete understanding of invasion requires understanding the role
of these proteins as well. The data so far available for mutants in these
additional genes is shown in Table 12. (This shows the already well-
established phenomenon [207] that the majority of these genes are
dispensable, with the notable exception of the complex comprising
Rh5, CyRPA and RIPR.)
The three CDPKs may seem an odd inclusion in this list: their se-
quences are so similar that they are placed in the same group by Or-
thoMCL and therefore not considered 1:1 orthologs between species.
One possible improvement to the methodology would be to use syn-
teny to assist in the calling of 1:1 orthologs, which I will consider for
future work.
5.6 discussion
5.6.1 Towards a holistic view of invasion
Earlier chapters have introduced two systems by which we may be
able to obtain phenotyping data for large numbers of genes using
high-throughput approaches. In this chapter I have developed a tool
that enables any researcher to retrieve this data, regardless of what
species of malaria they work on, in a matter of seconds. I have also
outlined how new approaches to analysing this data, leveraging in-
formation about protein-protein interactions or transcriptomics, may
allow additional insights to be drawn into parasite biology.
Table 12: Collated mutant phenotypes for Hu et al. invasion genes which
were not included in the 1:1:1 core invadome so far discussed. For
references please see the relevant page on PhenoPlasm.
Gene ID Description Mutant phenotype
PF3D7_0217500 calcium-dependent protein kinase 1 (CDPK1) Refractory
PF3D7_0717500 calcium-dependent protein kinase 4 (CDPK4) Viable
PF3D7_1337800 calcium-dependent protein kinase 5 (CDPK5) Refractory
PF3D7_0423800 cysteine-rich protective antigen (CyRPA) Refractory
PF3D7_0302500 cytoadherence linked asexual protein 3.1 (CLAG3.1) Viable
PF3D7_0302200 cytoadherence linked asexual protein 3.2 (CLAG3.2) Viable
PF3D7_0935800 cytoadherence linked asexual protein 9 (CLAG9) Viable
PF3D7_1035700 duffy binding-like merozoite surface protein (DBLMSP) Viable
PF3D7_1301600 erythrocyte binding antigen-140 (EBA140) Viable
PF3D7_0424300 erythrocyte binding antigen-165, pseudogene (EBA165) Viable
PF3D7_0731500 erythrocyte binding antigen-175 (EBA175) Viable
PF3D7_0102500 erythrocyte binding antigen-181 (EBA181) Viable
PF3D7_1035300 glutamate-rich protein (GLURP) Viable
PF3D7_1035600 merozoite surface protein (H101) Viable
PF3D7_1036000 merozoite surface protein (MSP11) Viable
PF3D7_1035400 merozoite surface protein 3 (MSP3) Viable
PF3D7_0206900 merozoite surface protein 5 (MSP5) Viable
PF3D7_1035500 merozoite surface protein 6 (MSP6) Viable
PF3D7_1335100 merozoite surface protein 7 (MSP7) Viable
PF3D7_0316000 microneme associated antigen (MA) Refractory
PF3D7_1476300 Plasmodium exported protein (PHISTb), unknown function Viable
PF3D7_0202100 Plasmodium exported protein (PHISTc), unknown function, liver stage associated protein 2 (LSAP2) Viable
PF3D7_1335400 reticulocyte binding protein 2 homologue a (RH2a) Viable
PF3D7_1335300 reticulocyte binding protein 2 homologue b (RH2b) Viable
PF3D7_0402300 reticulocyte binding protein homologue 1, normocyte binding protein 1 (RH1) Viable
PF3D7_1252400 reticulocyte binding protein homologue 3, pseudogene (RH3) Viable
PF3D7_0424200 reticulocyte binding protein homologue 4 (RH4) Viable
PF3D7_0424100 reticulocyte binding protein homologue 5 (RH5) Refractory
PF3D7_0323400 Rh5 interacting protein (RIPR) Refractory
PF3D7_0501500 rhoptry-associated protein 3 (RAP3) Viable
PF3D7_0208000 serine repeat antigen 1 (SERA1) Viable
PF3D7_0207800 serine repeat antigen 3 (SERA3) Viable
PF3D7_0207500 serine repeat antigen 6 (SERA6) Refractory
PF3D7_0802100 transcription factor with AP2 domain(s) (ApiAP2) Refractory
PF3D7_1235200 V-type K+-independent H+-translocating inorganic pyrophosphatase (VP2) Viable
When the barseq results described in Chapter 3 are added to Pheno-
Plasm, a search for the 418 genes of the P. falciparum invadome will
return 214 results. More than 50% of these result from my barcode
sequencing experiments; the remainder are derived from my survey
of the P. falciparum literature and those imported from the RMGMdb
database. This collection of phenotypes for what may be the majority
of proteins involved in invasion represents an important resource for
researchers studying this crucial part of the parasite lifecycle.
Already in this chapter we have seen the applications that such a
dataset can provide. Based on the localisations recorded in Pheno-
Plasm, and the PlasmoINT network, I was able to suggest that a pro-
tein for which immunofluorescence analysis gave an unclear apical lo-
calisation is very likely to be localised to the rhoptries. As additional
protein-protein interaction resources are developed by tools such as
BioID, and integrated into interaction networks, these approaches are
likely to become more and more powerful.
Having observed an AP2 gene giving an unexpected apparent blood-
stage phenotype, by integrating phenotype data with transcriptomic
data generated (by others) from parasites lacking this gene, I was able
to suggest a possible candidate downstream gene responsible for this
phenotype.
By analysing the substructure within phenotyped genes I was able
to observe some limited evidence for the phenomenon of centrality-
lethality in Plasmodium invasion. This type of analysis also enabled
the observation that a gene’s mutant phenotype is correlated with
that of those which interact with it. This enabled attempts to pre-
dict the phenotypes of genes as yet uncharacterised experimentally,
though the accuracy of these predictions remains unclear.
The range of approaches described above is just a hint of the possi-
bilities that large-scale phenotypic data can provide when made eas-
ily available to the wider research community. I hope that Pheno-
Plasm, and some of the approaches to analysing its data discussed in
this chapter will contribute to the realisation of some of these possi-
bilities.
5.6.2 Areas for follow-up
5.6.2.1 Protein-protein interaction network
Much of the analysis in this chapter, and this project, relies on an in-
teraction network which is now five years old. Clearly the quality of
interactions available has major implications for what can be inferred
in these analyses. In the coming years I anticipate a number of pub-
lications using new approaches (such as protein tagging with biotin
ligases or peroxidases), as well as more conventional immunoprecipi-
tation, to enable large numbers of new, high quality, interactions to be
established. These may not be sufficient to generate a genome-wide
interactome but, properly integrated into the other data sources used
by Hu et al., they will provide heavily weighted known interactions
that improve the network as a whole.
There are also a number of new larger scale datasets that should be
added to such an analysis. Co-expression in AP2 knockouts (such as
the one discussed earlier in this chapter) may indicate an increased
likelihood of interaction, as should shared presence in the proteome
of a certain parasite stage.
These must all be integrated in the next generation of Plasmodium
interaction networks, and one important result will be an increase in
the proportion of the genome covered in high confidence predictions
above the two-thirds coverage that exists today.
Ultimately such a dataset may produce a network diagram in which
clusters of genes are very well separated by parasite stage, and when
phenotyping information is overlaid on this the sexual and mosquito
stages will light up with the green of blood-stage redundant genes.
5.6.2.2 Improvements to PhenoPlasm
There are a number of additional features that could be added to
PhenoPlasm. The database already collects information on the type of
experimental approach taken to establish a phenotype, but does not
yet allow searching based on this. This will become important as more
and more high-throughput approaches are developed: a user may
want to retrieve only genes where cloned lines have been produced
carrying a knock out, or for other purposes they might be prepared to
consider a gene redundant if dCas9 targeted to its genomic location
was tolerated by the parasite. Collecting all this data and then making
it appropriately searchable will be essential.
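The kind of approach-aware search described above could be sketched as follows. This is only an illustration: the record fields, the approach labels and the gene identifiers are hypothetical and do not reflect the actual PhenoPlasm schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeRecord:
    gene_id: str    # hypothetical gene identifier
    phenotype: str  # e.g. "essential" or "redundant"
    approach: str   # hypothetical label for the experimental approach


def genes_with_phenotype(records, phenotype, accepted_approaches):
    """Return sorted gene IDs whose phenotype call is supported by at
    least one record obtained with an accepted experimental approach."""
    return sorted({
        r.gene_id
        for r in records
        if r.phenotype == phenotype and r.approach in accepted_approaches
    })


# Illustrative records only; these are not real data.
RECORDS = [
    PhenotypeRecord("GENE_A", "redundant", "cloned_knockout"),
    PhenotypeRecord("GENE_B", "redundant", "barseq"),
    PhenotypeRecord("GENE_C", "redundant", "dcas9"),
    PhenotypeRecord("GENE_D", "essential", "barseq"),
]

# A strict user accepts only cloned knock-out lines as evidence.
strict = genes_with_phenotype(RECORDS, "redundant", {"cloned_knockout"})

# A more permissive user also accepts pooled or dCas9-based evidence.
permissive = genes_with_phenotype(
    RECORDS, "redundant", {"cloned_knockout", "barseq", "dcas9"}
)
```

Storing the approach alongside each phenotype call, rather than collapsing calls per gene, is what lets different users apply different evidence thresholds to the same data.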
Part III
DISCUSSION
6
GENERAL DISCUSSION
6.1 achievements
6.1.1 New phenotyping data for putative invasion genes may inform target
selection and models of the invasion process
Invasion genes are key targets for vaccine development, and may also
represent good drug targets. In Chapter 3, 71 likely essential putative
invasion-related genes were identified. 54 of these were being inves-
tigated for the first time in any Plasmodium species, and each one is
now part of a very long list of potential drug/vaccine targets. These
data satisfy only the first item on a long checklist of requirements
for a drug target, but this is nonetheless an essential step.
In addition to establishing new potential targets, my data also rule
out some genes which might previously have been seen as candidates.
As a result of the open-ended screening in this study we overturned
phenotypes for 3 genes previously considered essential, including
ROM4 which is one of a class of rhomboid proteases considered
possible therapeutic targets in Apicomplexa. [23] The use of efficient
PlasmoGEM vectors, first in barseq experiments and then in single
transfection follow-up, demonstrated that knock-outs of these genes
are viable (though in some cases attenuated).
Another key role for the identification of phenotypes for uncharac-
terised genes is to limit the search space in investigations which try
to find the gene that fills a particular role in invasion. As a theoretical
example, the identity of the Plasmodium glideosome adhesin connector
(GAC) is not yet definitively established. One approach to identify this
would be to pull down proteins binding to parasite actin. However
such a biochemical approach would probably identify a large num-
ber of proteins, many of which would be false positives. Most models
would assume the GAC to be essential for invasion and so the incor-
poration of my phenotyping data would allow any proteins identified
by biochemical means to be filtered to a more manageable number:
genes with viable knock-outs could be excluded as candidates, and
uncharacterised genes shown by barseq to be essential could be pri-
oritised for further investigation.
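As a minimal sketch of this filtering step, candidates from a pull-down could be triaged against phenotype calls as below. The gene names and phenotype labels are hypothetical placeholders, and the three-way split is an assumption made for illustration rather than a procedure used in this thesis.

```python
def prioritise_candidates(pulldown_hits, phenotype_calls):
    """Split pull-down hits into prioritised, untested and excluded genes.

    pulldown_hits: iterable of gene IDs recovered biochemically.
    phenotype_calls: dict mapping gene ID to "essential" or "redundant".
    """
    prioritised, untested, excluded = [], [], []
    for gene in pulldown_hits:
        call = phenotype_calls.get(gene)
        if call == "essential":
            prioritised.append(gene)  # consistent with an indispensable role
        elif call == "redundant":
            excluded.append(gene)     # a viable knock-out argues against it
        else:
            untested.append(gene)     # no phenotype data available yet
    return prioritised, untested, excluded


# Illustrative inputs only; these are not real data.
hits = ["GENE_1", "GENE_2", "GENE_3", "GENE_4"]
calls = {"GENE_1": "redundant", "GENE_2": "essential", "GENE_4": "redundant"}

prioritised, untested, excluded = prioritise_candidates(hits, calls)
```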
6.1.2 A new model system for PlasmoGEM
After a number of experiments to optimise the system for parasite
culture, I have brought the PlasmoGEM approach, which has already
made a profound impact on P. berghei genetics, to a new species, P.
knowlesi. This new system has significant additional benefits. While
the P. berghei system is clearly more efficient, and so has a certain
level of genetic power that cannot yet be matched in P. knowlesi, my
adaptation of PlasmoGEM to P. knowlesi has two key advantages over
P. berghei: a robust in vitro culture system and the presence of human
erythrocytes. The in vitro nature of this system should simplify
tagging experiments, reduce animal usage and enable more detailed
phenotyping experiments (such as discriminating between genes in-
volved in egress and invasion), and the presence of human erythro-
cytes clearly will make some results more clinically relevant to other
human malaria parasites. Finally, the emergence of the threat to hu-
man health posed by P. knowlesi in Southeast Asia, and an increased
appreciation of the potentially severe clinical complications of P. knowlesi
malaria, means that results obtained in this species may have direct
clinical relevance in their own right.
6.1.3 New in silico approaches and a new tool for the experimental genetics
community
I have also investigated the patterns within my data at scale. Such
analyses are made difficult by the uncharacterised nature of these
proteins, for which there are no existing mapped pathways. However,
my use of previously inferred bioinformatic protein-protein interactions
has revealed patterns within the phenotyping data and allowed new
insights, such as the suggestion of possible models for the attenuation
observed in an AP2 mutant by barseq.
Finally, the development of a database in which the data generated
in P. berghei and P. knowlesi can be deposited, and its supplementation
with a full literature review of the experimental genetics of invasion
genes in P. falciparum, may prove a valuable resource to other
researchers.
6.2 screening limitations
6.2.1 False negatives may occur in barseq approaches
In any conventional knock-out approach it is very difficult to distinguish
between a failure to obtain mutants due to gene essentiality,
and a failure due to an inability of the targeting construct to integrate.
These issues are evidenced by the examples of genes refractory
to deletion in previously published experiments where PlasmoGEM
vectors, with their long homology arms, have now enabled disruption,
including three verified examples described in Chapter 3. Even
with the efficiency of PlasmoGEM technology a non-zero number of
our vectors will have insufficient homology to integrate, or contain
spontaneous mutations that disrupt their ability to create resistant
parasites. Thus we cannot be certain that any gene described as
essential by barcode sequencing is truly essential, but we can say
that there is a high probability that the gene is essential or that
its knockout results in severely attenuated parasites. Validation
with conditional technologies, which could also be scaled using
PlasmoGEM and are discussed further below, will be needed to
confirm essentiality.
6.2.2 False positives are a possibility
There is also a potential for false-positive events in barcode-sequencing.
In very rare events, a locus can be spontaneously duplicated (we saw
an example of this in the parental A1.H1 P. knowlesi line in Chapter 4).
In P. berghei such an event has previously allowed an essential gene
to be duplicated so that one copy could be deleted and replaced by
a barcode. [79] The only way to exclude such a result is to follow up
with an experiment to clone out the parasite, and perform a Southern
blot or qPCR. However the typical success of these follow-ups gives
considerable confidence to an initial ’redundant’ call by barseq.
One advantage of the qPCR approach which I developed as part of
my application of PlasmoGEM to P. knowlesi is that it avoids any pos-
sibility of false-positives, by directly assaying gene deletion, but this
approach cannot be used in pooled transfections. Because P. knowlesi
is an in vitro system the single knock-out approach might be scalable
even to a genome scale, especially with the application of automated
cell culture facilities.
6.2.3 On the meaning of ‘essentiality’
The difference between severe attenuation and ‘essentiality’ (which
technically perhaps means a generational growth rate < 1) is impossible
to establish by barcode-sequencing. There are two reasons for this.
Firstly essentiality is a relative term, and dependent on context. For
example, the host immune system exerts pressure on the parasite, and
a gene that appears essential in our experiments might be merely at-
tenuated in an immunodeficient mouse. A separate issue is that very
slow-growing parasites are rapidly outcompeted from the population
and may reach levels undetectable by our sequencing methodology,
preventing their measurement and leading them to be called as likely
essential. These effects are not of major concern since merely know-
ing that loss of a gene causes a severe decrease in fitness provides a
great deal of information. But this does make it more difficult to test
models predicting that a particular gene plays an indispensable role
in invasion. Conditional approaches, which I will discuss, are likely
to be the solution.
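The second effect can be illustrated with a simple competition model. The sketch below assumes that wild-type and mutant parasites each multiply by a fixed factor per cycle and that a mutant is lost from view once its share of the population falls below a detection limit; the growth factors, starting fraction and detection limit are illustrative values, not measurements from this thesis.

```python
def mutant_fraction(f0, r_mut, r_wt, cycles):
    """Fraction of the population that is mutant after a number of cycles,
    given a starting fraction f0 and per-cycle multiplication factors."""
    mut = f0 * r_mut ** cycles
    wt = (1.0 - f0) * r_wt ** cycles
    return mut / (mut + wt)


def cycles_until_undetectable(f0, r_mut, r_wt, detection_limit, max_cycles=100):
    """First cycle at which the mutant fraction drops below the detection
    limit, or None if this does not happen within max_cycles."""
    for n in range(max_cycles + 1):
        if mutant_fraction(f0, r_mut, r_wt, n) < detection_limit:
            return n
    return None


# A mutant that still doubles every cycle (growth rate > 1, so not
# essential in the strict sense) competing against a faster wild type.
viable_but_slow = cycles_until_undetectable(
    f0=0.01, r_mut=2.0, r_wt=10.0, detection_limit=1e-5
)

# A mutant with wild-type fitness is never lost.
neutral = cycles_until_undetectable(
    f0=0.01, r_mut=10.0, r_wt=10.0, detection_limit=1e-5
)
```

Under these assumptions a viable but slow-growing mutant becomes undetectable within a handful of cycles, which is why barseq alone cannot separate it from a mutant whose growth rate is genuinely below 1.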
6.2.4 ...and ‘redundancy’
40% of P. berghei core invadome genes profiled were identified by
barseq as redundant with no statistical evidence of attenuation. It is
tempting to think that these genes are of little importance. However it
is worth considering that estimates put the divergence time between
P. falciparum and its closest relative, P. reichenowi, at 5-7 million years
ago. The divergence time between the murine malarias is estimated
at more than 24 million years ago. [175] The split therefore between
P. falciparum and the phylogenetic branch containing P. berghei and P.
knowlesi likely occurred much earlier in evolutionary history. Every
gene in the core invadome has been conserved, with 1:1 orthology,
throughout these many millions of years and this provides strong ev-
idence that any parasites which lost them were at a significant fitness
disadvantage in at least some circumstances.
6.2.4.1 Genetic interactions
In some cases the appearance of redundancy will be because the func-
tion of the gene in question gives only a moderate growth advantage -
slightly increasing adhesion to the erythrocyte, for example. In these
cases many such genes would have to be knocked out to create an
additive effect which abolished erythrocyte binding and became es-
sential.
In other cases it may be that the gene is involved in an alternative
pathway, necessitated only in certain conditions (for example the par-
asite encountering a host which has built up an immune response
to one invasion pathway). This would mean that a single knock-out
would have wild-type fitness but that the additional knock-out of a
second ‘gene’ might prevent invasion entirely.
Ultimately these factors mean that to fully dissect the genetics of
parasite machinery it will be necessary to investigate synthetic
lethality, and synergistic genetic interactions more broadly.
6.2.4.2 Understanding the interplay between invasion and the immune sys-
tem
Invasion is a unique point in the intraerythrocytic cycle where the
parasite is outside a red blood cell and directly accessible to the im-
mune system. This is a powerful selective force which has doubtless
had profound effects on parasite evolution.
It is likely that many merozoite proteins have functionality
that becomes apparent only in the presence of a functional immune
system. In the P. knowlesi in vitro system, there is of course no immune
system, and complement present in the serum is destroyed by heat-
inactivation, so any such effects would not be observed.
While the Theiler’s Original mice used in my P. berghei ANKA ex-
periments all possessed an immune system, they are unable to control
P. berghei infections and would die if infections were allowed to con-
tinue. This is unlikely to reflect the natural African setting for the
parasite where chronic infections are presumably favoured for trans-
mission. The clear failure of the mouse to clear the parasite may be
indicative of a limitation of even this model to identify genetic effects
which result from interactions with the immune system.
In addition some aspects of immune evasion would not be ex-
pected to give cell-autonomous genetic effects (e.g. those involving
immunomodulation). If a mutant’s defect can be rescued by the pres-
ence of parasites of a different genotype in a co-infection then this
defect will not be visible in a barcode-sequencing approach.
6.2.5 Barseq in P. berghei fails to directly assay invasion
Balu et al. [15] wrote that:
’Malaria parasite growth in the intraerythrocytic stages is
determined by multiple factors: the length of the asexual
cell cycle, the number of merozoites produced per sch-
izont, the efficiency of merozoite release from host cells,
and the invasion efficiency of new host cells by the newly
formed merozoites.’
Their argument was that measuring growth rates allows access to
these many different processes. But this also means that if one
measures an attenuated growth rate, even for a mutant in a gene expected
to be involved in invasion, one cannot be sure it is the result of a
defect in invasion.
Other than a single immunofluorescence experiment, and sanity-
checking bioinformatics, I have done very little in this thesis to verify
that the genes with which I have been working actually have a role
in invasion. The data, such as they are, are still very helpful, but the
future development of invasion-specific assays will be important.
Expanding these studies into in vitro systems, such as was the focus
of the P. knowlesi work, will be necessary to make such stage-specific
functional inferences.
6.2.6 Crossing species barriers
In this thesis I have worked across three species of malaria parasite,
and been able to exploit the strengths of each: the wealth of
bioinformatic data available for P. falciparum, the robust genetic system
available in P. berghei and the potential for high efficiency in vitro
transfection in P. knowlesi. There are clear advantages to such an ap-
proach, but caution must also be exercised.
Ultimately all research into malaria is undertaken because of the
possibility that the results generated may someday have a role to
play in managing or decreasing the hundreds of thousands of deaths
caused by the parasite each year. The vast majority of these deaths are
caused by P. falciparum (though the morbidity due to P. vivax should
not be neglected).
Therefore where model systems are used it is important that dis-
coveries made will be of relevance to human parasites. I believe that
our approach of finding 1:1:1 orthologues will mean that for the most
part results are applicable to P. falciparum. But we do have examples,
such as the palmitoyl-transferase DHHC3, of proteins with 1:1:1 or-
thology but whose localisation, and likely hence also function, differs
between the species. The possibility of such genetic ’false friends’
must be considered when inferences are made across species.
6.3 extensions of work in p. berghei
There are a number of ways in which I believe the P. berghei invadome
experiments could be extended.
A key limitation of time-point based barseq is that it gives an indi-
cation only of mutant fitness, but not of the part of parasite biology
in which the gene is involved. However with novel approaches there
may be the potential to gain a more detailed view of the biological
mechanisms underlying invasion. Even in P. berghei, in which invasion
is often thought inaccessible, there may be some potential for
this.
6.3.1 Assaying P. berghei invasion in vitro
While P. berghei is primarily an in vivo system it may be possible to
conduct assays of invasion with an in vitro approach. In this system
a pooled barcode transfection of the invadome would be carried out
and grown up in a mouse. At a late stage of infection the mouse
would be bled and ring parasites cultured to schizont stage, some
of which would be analysed by barseq. Mechanical disruption could
then be used to release merozoites, and their invasive capacity as-
sayed by counting barcodes from the fresh rings formed after inva-
sion. It appears that the use of an orbital shaker to induce disruption
of the schizonts [96] might achieve relatively efficient invasion into
reticulocytes with minimal mortality.
6.3.2 Comparative assay of P. berghei invasion in vivo
An alternative approach would be to investigate the connection be-
tween parasite genes and erythrocyte receptors by performing com-
parative invasion assays, which could potentially be performed in
vivo. In such an approach first a pooled transfection of the entire
invadome would be carried out as already described. A sample of
the blood would then be taken on, say, day 6 post-invasion. This
would contain ring parasites, and erythrocytes could be labelled with
fluorophore-conjugated antibodies, for example anti-CD71 [137],
which would distinguish CD71-positive reticulocytes from other
red blood cells. The two erythrocyte populations could then be separated
by fluorescence-activated cell sorting (FACS). After this process
DNA could be extracted from the two subpopulations and barseq
performed.
Mutants differing significantly in abundance between the two pop-
ulations would likely be indicative of genes permitting invasion into
the two cell-types. P. berghei shows a significant reticulocyte prefer-
ence and so one might expect some genes to be identified, potentially
with important implications for reticulocyte-restricted P. vivax [134].
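The comparison between the two sorted populations could be sketched as a per-mutant log ratio of barcode proportions. The barcode counts, mutant names and pseudocount below are hypothetical, and a real analysis would need replicate experiments and a statistical test, neither of which this sketch provides.

```python
import math


def barcode_log2_ratios(counts_a, counts_b, pseudocount=1.0):
    """Log2 ratio of each barcode's proportion in population A relative
    to population B. A pseudocount avoids division by zero when a
    barcode is absent from one population."""
    barcodes = sorted(set(counts_a) | set(counts_b))
    total_a = sum(counts_a.get(b, 0) + pseudocount for b in barcodes)
    total_b = sum(counts_b.get(b, 0) + pseudocount for b in barcodes)
    ratios = {}
    for b in barcodes:
        prop_a = (counts_a.get(b, 0) + pseudocount) / total_a
        prop_b = (counts_b.get(b, 0) + pseudocount) / total_b
        ratios[b] = math.log2(prop_a / prop_b)
    return ratios


# Illustrative counts only: mutant_2 is depleted from the CD71-negative
# population, as might be expected for a gene needed to invade those cells.
reticulocyte_counts = {"mutant_1": 500, "mutant_2": 500, "mutant_3": 500}
normocyte_counts = {"mutant_1": 500, "mutant_2": 50, "mutant_3": 500}

ratios = barcode_log2_ratios(reticulocyte_counts, normocyte_counts)
```

Because proportions are compositional, depletion of one mutant slightly shifts the ratios of all the others, so mutants are best interpreted relative to the bulk of the pool rather than against zero.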
Extending this in vivo approach, we need not necessarily limit ourselves
to testing the repertoire of cells existing within a mouse: a
further possibility would be to treat target erythrocytes in vitro
with enzymes such as trypsin, chymotrypsin and neuraminidase in
order to deplete certain surface receptors, then to label these various
populations with fluorescent dyes ([190]). These different target blood
cells could be pooled and injected into a mouse midway through an
invadome barseq transfection. A few hours later, to allow reinvasion,
blood could be taken from this mouse and sorted by FACS to separate
the labelled cells, then DNA extracted and barcodes counted. In theory
this might allow an in vivo invasion assay identifying genes involved
in binding to certain host cell receptors.
6.4 extensions of work in p. knowlesi
6.4.1 Barcode-sequencing with technology developed to date
As I discussed at the end of Chapter 4, we are already in a position
to begin barseq experiments in P. knowlesi. However with existing
technology these will be more complex than those in P. berghei due
to the need to perform individual transfections and culture for each
vector, before pooling.
The power of this approach will depend upon the level of attenua-
tion of mutants that can be recovered with PlasmoGEM transfections
in P. knowlesi. Testing this will require further experiments. In the
event that it is discovered that only the fittest of knock-outs can be
recovered, it may be important to first investigate ways of enhancing
the efficiency of genetic modification, as I will discuss later.
Effective barseq established in vitro would enable a wealth of ex-
periments because of the easy accessibility of invasion. Barseq exper-
iments could be used to identify mutants specifically depleted after
invasion into enzyme treated erythrocytes, erythrocytes of different
ages and even different species (Homo sapiens, Macaca fascicularis and
Macaca nemestrina).
6.4.2 Potential for large scale endogenous tagging
With the genetic modification technology developed in Chapter 4, it is
already clear that tagging transfections in P. knowlesi are highly effec-
tive. In every tagging experiment the vast majority of the transfected
population carried the correct modification. There are two major ways
in which I can imagine tagging contributing to our understanding of
invasion. The first is localisation: invasion involves a coordinated
sequence of events involving a series of organelles, from the merozoite
surface to the micronemes to the rhoptries to the inner membrane
complex. Understanding where a protein is localised provides a great
deal of information about its potential role in invasion and so
high-throughput tagging is likely to be an important component of better
understanding invasion at scale. This could take place either exactly
as we carried it out in this study, with an epitope tag, or using
fluorescent protein fusions which might permit analysis in live cells.
6.4.3 Conditional analysis to allow the investigation of essential genes
A very different application of tagging would be for establishing the
functional importance of genes with a conditional mutagenesis ap-
proach. Given the observation that episomes can be established in P.
knowlesi, there is a potential concern of parasites harbouring episomes
outcompeting knock-outs and so causing barcode-sequencing to give
false-positive results in a pooled transfection. This concern could be
mitigated with a conditional approach. In such a system the integra-
tion of a barcode would not knock out a gene but tag it with a se-
quence allowing conditional knock-down such as the destabilisation-
domain (DD). [12] Such parasites would be expected to have neg-
ligible fitness defects in permissive conditions (in the presence of
the compound Shld-1) and so to outcompete episomes (due to seg-
regational instability). Thus a mixed population could be established
and barcodes counted in these permissive conditions. Shld-1 could
then be removed and barcodes counted after a round of invasion.
Any difference in barcode abundance between permissive and non-
permissive conditions would necessarily indicate a fitness defect due
to the knockdown. It is possible that this strategy, or a similar approach
with one of the other Plasmodium conditional systems [48],
would also help to separate a role in invasion from a more general
role in metabolism by allowing induction only at the late schizont
stage.
6.4.4 Approaches to improve the rate of DNA integration
The observation that P. knowlesi is capable of carrying a resistance
marker from a transfection without necessarily undergoing the in-
tended ends-out integration event suggests that the robustness of
transfection could be increased by selecting against such parasites.
6.4.4.1 Negative selection marker
One possible means of achieving this is to transfect with vectors, ei-
ther circular or linear, which carry a marker outside the homology
region that allows parasites carrying it to be selected against. This
would select against circular episomes and therefore (in combination
with positive selection) in favour of ends-out integration events. The
thymidine kinase negative selection marker has already been shown
to function in P. knowlesi [199] and so would be a sensible candidate
for such an approach.
6.4.4.2 Crispr/Cas9
A Crispr/Cas9 approach may kill two birds with one stone by pro-
viding negative selection while also enhancing recombination into
the target locus.
Provided there is a good efficiency of cutting (which my in silico
analysis suggested should be achievable for the vast majority of the
P. knowlesi genome), Crispr/Cas9 will provide a constant supply of
double-strand breaks in unmodified loci. In theory, given the lack
of non-homologous end-joining in this system, these will be resolved
either by the correct integration of a targeting construct or by the
death of the parasite.
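An in silico search for candidate cut sites can be sketched as below. This assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3' of a 20-nucleotide target; the choice of enzyme, the guide length and the example sequence are assumptions made for illustration and are not taken from the analysis in Chapter 4.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def _scan_strand(seq, guide_len):
    """Yield (guide, pam) pairs for every NGG PAM preceded by a full guide."""
    for pam_start in range(guide_len, len(seq) - 2):
        if seq[pam_start + 1:pam_start + 3] == "GG":
            guide = seq[pam_start - guide_len:pam_start]
            yield guide, seq[pam_start:pam_start + 3]


def find_ngg_sites(seq, guide_len=20):
    """List (guide, pam, strand) for candidate sites on both strands."""
    seq = seq.upper()
    sites = [(g, p, "+") for g, p in _scan_strand(seq, guide_len)]
    sites += [(g, p, "-")
              for g, p in _scan_strand(reverse_complement(seq), guide_len)]
    return sites


# Illustrative sequence only: a 20-nt stretch followed by a TGG PAM.
example_sites = find_ngg_sites("A" * 20 + "TGG")
```

A genome-wide version of this scan would simply be run per locus, reporting whether at least one usable site exists within the region to be modified; it says nothing about cutting efficiency or off-target sites, which would need separate assessment.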
Thus selection for a cassette encoding the Cas9 protein and a guide
RNA may substantially boost transfection efficiencies as the presence
of double-strand breaks recruits DNA repair machinery to the target
site. My experiments to date have suggested a more modest improve-
ment, but trouble-shooting is needed to investigate whether this is
simply due to technical hurdles.
If Crispr approaches were highly successful there might be a value
in encoding a guide RNA from directly within each PlasmoGEM vec-
tor. This could be achieved by using a long oligo during the recombi-
neering process containing the guide RNA sequence driven by a T7
promoter. Such an approach might make a pooled barseq experiment
possible in P. knowlesi by recreating the condition present in P. berghei
that the presence of a barcode definitively indicates the knocking out
of a gene in the parasite.
6.5 final thoughts
2015 is an exciting time to be studying invasion in Plasmodium. It is
clear just how much there is still to be understood about this crucial
event. Simple probability means that some of the vast number of un-
characterised Plasmodium genes must be involved in this process, in
ways we do not yet understand.
The high-throughput tools now becoming available should allow
the extension of my results, not only to identify the importance of
these genes but also to establish that their role is in invasion.
High-throughput localisation should even be able to indicate at what
stage of the intricate process they play their role.
As our understanding of invasion is filled in with increased resolu-
tion, the power of large scale bioinformatic analysis based on protein-
protein interaction data can be expected to multiply the effect of each
discovery by pulling previously uncharacterised genes into increasingly
sophisticated models. These approaches will have to bring together
data on a protein’s interactions on a molecular scale, on its
organellar localisation, and on the phenotypes of parasites created
when its gene is disrupted, in terms of the classes of cell that can be
invaded and their relative efficiencies.
The technology for generating each of these classes of data at scale
is just starting to be employed in laboratories around the world. In
the coming years, we can hope that at their intersection will emerge
a much clearer understanding of the complex molecular machinery
that underlies erythrocyte invasion.
BIBLIOGRAPHY
[1] J. H. Adams, B. K. L. Sim, S. A. Dolan, X. Fang,
D. C. Kaslow, and L. H. Miller. A family of erythrocyte binding
proteins of malaria parasites. 89(August):7085-7089, 1992.
[2] A. M. Ahmed, M. M. Pinheiro, P. C. Divis, A. Siner, R. Zain-
udin, I. T. Wong, C. W. Lu, S. K. Singh-Khaira, S. B.
Millar, S. Lynch, M. Willmann, B. Singh, S. Krishna, and
J. Cox-Singh. Disease progression in Plasmodium knowlesi
malaria is linked to variation in invasion gene family mem-
bers. PLoS neglected tropical diseases, 8(8):e3086, Aug. 2014.
ISSN 1935-2735. doi: 10.1371/journal.pntd.0003086. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=4133233&tool=pmcentrez&rendertype=abstract.
[3] K. A. Akinosoglou, E. S. C. Bushell, C. V. Ukegbu,
T. Schlegelmilch, J.-S. Cho, S. Redmond, K. Sala, G. K.
Christophides, and D. Vlachou. Characterization of
Plasmodium developmental transcriptomes in Anophe-
les gambiae midgut reveals novel regulators of malaria
transmission. Cellular microbiology, 17(2):254-68, Mar.
2015. ISSN 1462-5822. doi: 10.1111/cmi.12363. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=4371638&tool=pmcentrez&rendertype=abstract.
[4] M. M. Alam, L. Solyakov, A. R. Bottrill, C. Flueck, F. A. Siddiqui,
S. Singh, S. Mistry, M. Viskaduraki, K. Lee, C. S. Hopp, C. E.
Chitnis, C. Doerig, R. W. Moon, J. L. Green, A. A. Holder, D. A.
Baker, and A. B. Tobin. Phosphoproteomics reveals malaria par-
asite Protein Kinase G as a signalling hub regulating egress and
invasion. Nature communications, 6:7285, Jan. 2015. ISSN 2041-
1723. doi: 10.1038/ncomms8285. URL http://www.nature.com/
ncomms/2015/150706/ncomms8285/full/ncomms8285.html.
[5] A. P. Alker, V. Mwapasa, and S. R. Meshnick. Rapid real-time
PCR genotyping of mutations associated with sulfadoxine-
pyrimethamine resistance in Plasmodium falciparum. Antimicrobial
agents and chemotherapy, 48(8):2924-9, Aug. 2004.
ISSN 0066-4804. doi: 10.1128/AAC.48.8.2924-2929.2004. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=478490&tool=pmcentrez&rendertype=abstract.
[6] R. J. W. Allen and K. Kirk. Plasmodium falciparum culture: the
benefits of shaking. Molecular and biochemical parasitology, 169(1):
63-5, Jan. 2010. ISSN 1872-9428. doi: 10.1016/j.molbiopara.2009.
09.005. URL http://www.ncbi.nlm.nih.gov/pubmed/19766147.
[7] R. J. W. Allen and K. Kirk. Plasmodium falciparum culture: The
benefits of shaking. Molecular and Biochemical Parasitology, 169
(1):63-65, 2010. ISSN 0166-6851. doi: 10.1016/j.molbiopara.2009.
09.005.
[8] P. L. Alonso and M. Tanner. Public health challenges and
prospects for malaria control and elimination. Nature medicine,
19(2):150-5, Feb. 2013. ISSN 1546-170X. doi: 10.1038/nm.3077.
URL http://dx.doi.org/10.1038/nm.3077.
[9] M. Anisimova and O. Gascuel. Approximate likelihood-ratio
test for branches: A fast, accurate, and powerful alternative.
Systematic biology, 55(4):539-52, Aug. 2006. ISSN 1063-5157. doi:
10.1080/10635150600755453. URL http://www.ncbi.nlm.nih.
gov/pubmed/16785212.
[10] F. A. Ansari, N. Kumar, M. Bala Subramanyam, M. Gnanamani,
and S. Ramachandran. MAAP: malarial adhesins and adhesin-
like proteins predictor. Proteins, 70(3):659-66, Feb. 2008. ISSN
1097-0134. doi: 10.1002/prot.21568. URL http://www.ncbi.nlm.
nih.gov/pubmed/17879344.
[11] S. Antinori, L. Milazzo, and M. Corbellino. Plasmodium
knowlesi: An Overlooked Italian Discovery? Clinical infec-
tious diseases : an official publication of the Infectious Diseases
Society of America, 53(8):849; author reply 849-50, Oct. 2011.
ISSN 1537-6591. doi: 10.1093/cid/cir527. URL http://cid.
oxfordjournals.org/content/53/8/849.1.full.
[12] C. M. Armstrong and D. E. Goldberg. An FKBP destabilization
domain modulates protein levels in Plasmodium falciparum.
Nature methods, 4(12):1007-9, Dec. 2007. ISSN 1548-7105.
doi: 10.1038/nmeth1132. URL http://www.ncbi.nlm.nih.gov/
pubmed/17994030.
[13] D. L. Baldi, K. T. Andrews, R. F. Waller, D. S. Roos, R. F.
Howard, B. S. Crabb, and A. F. Cowman. RAP1 controls
rhoptry targeting of RAP2 in the malaria parasite Plasmodium
falciparum. The EMBO journal, 19(11):2435-43, June
2000. ISSN 0261-4189. doi: 10.1093/emboj/19.11.2435. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=212767&tool=pmcentrez&rendertype=abstract.
[14] B. Balu, C. Chauhan, S. P. Maher, D. A. Shoue, J. C.
Kissinger, M. J. Fraser, and J. H. Adams. piggyBac is
an effective tool for functional analysis of the Plasmod-
ium falciparum genome. BMC microbiology, 9:83, Jan.
2009. ISSN 1471-2180. doi: 10.1186/1471-2180-9-83. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2686711&tool=pmcentrez&rendertype=abstract.
[15] B. Balu, N. Singh, S. P. Maher, and J. H. Adams. A ge-
netic screen for attenuated growth identifies genes cru-
cial for intraerythrocytic development of Plasmodium
falciparum. PloS one, 5(10):e13282, Jan. 2010. ISSN
1932-6203. doi: 10.1371/journal.pone.0013282. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2952599&tool=pmcentrez&rendertype=abstract.
[16] D. Bargieri, V. Lagal, I. Tardieux, and R. Ménard. Host cell
invasion by apicomplexans: what do we know? Trends in
parasitology, 28(4):131-5, Apr. 2012. ISSN 1471-5007. doi: 10.1016/
j.pt.2012.01.005. URL http://www.ncbi.nlm.nih.gov/pubmed/
22326913.
[17] J. Baum, D. Richard, J. Healer, M. Rug, Z. Krnajski, T.-W.
Gilberger, J. L. Green, A. A. Holder, and A. F. Cowman. A
conserved molecular motor drives cell invasion and gliding
motility across malaria life cycle stages and other apicomplexan
parasites. The Journal of biological chemistry, 281(8):5197-208, Feb.
2006. ISSN 0021-9258. doi: 10.1074/jbc.M509807200. URL
http://www.ncbi.nlm.nih.gov/pubmed/16321976.
[18] J. Baum, L. Chen, J. Healer, S. Lopaticki, M. Boyle, T. Triglia,
F. Ehlgen, S. A. Ralph, J. G. Beeson, and A. F. Cowman.
Reticulocyte-binding protein homologue 5 - an essential adhesin
involved in invasion of human erythrocytes by Plasmodium
falciparum. International journal for parasitology, 39(3):371-80,
Feb. 2009. ISSN 1879-0135. doi: 10.1016/j.ijpara.2008.10.006.
URL http://www.ncbi.nlm.nih.gov/pubmed/19000690.
[19] J. Baum, A. T. Papenfuss, G. R. Mair, C. J. Janse, D. Vlachou,
A. P. Waters, A. F. Cowman, B. S. Crabb, and T. F. de Koning-
Ward. Molecular genetics and comparative genomics reveal
RNAi is not functional in malaria parasites. Nucleic Acids
Research, 37(11):3788-3798, Apr. 2009. ISSN 0305-1048. doi:
10.1093/nar/gkp239. URL http://nar.oxfordjournals.org/
content/37/11/3788.full.
[20] A. Berendt, G. Tumer, and C. Newbold. Cerebral malaria:
The sequestration hypothesis. Parasitology Today, 10(10):
412-414, Jan. 1994. ISSN 0169-4758. doi: 10.1016/
0169-4758(94)90238-0. URL http://www.sciencedirect.com/
science/article/pii/0169475894902380.
[21] M. Botha, E.-R. Pesce, and G. L. Blatch. The Hsp40 proteins
of Plasmodium falciparum and other apicomplexa: regulating
chaperone power in the parasite and the host. The international
journal of biochemistry & cell biology, 39(10):1781–803, Jan. 2007.
ISSN 1357-2725. doi: 10.1016/j.biocel.2007.02.011. URL http:
//www.ncbi.nlm.nih.gov/pubmed/17428722.
[22] Z. Bozdech, M. Llinás, B. L. Pulliam, E. D. Wong, J. Zhu, and
J. L. DeRisi. The transcriptome of the intraerythrocytic develop-
mental cycle of Plasmodium falciparum. PLoS biology, 1(1):E5,
Oct. 2003. ISSN 1545-7885. doi: 10.1371/journal.pbio.0000005.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=176545&tool=pmcentrez&rendertype=abstract.
[23] F. Brossier, T. J. Jewett, L. D. Sibley, and S. Urban. A spatially
localized rhomboid protease cleaves cell surface adhesins es-
sential for invasion by Toxoplasma. Proceedings of the National
Academy of Sciences of the United States of America, 102(11):4146–51,
Mar. 2005. ISSN 0027-8424. doi: 10.1073/pnas.0407918102.
URL http://www.pnas.org/content/102/11/4146.abstract.
[24] G. A. Butcher. Factors affecting the in vitro culture
of Plasmodium falciparum and Plasmodium knowlesi.
Bulletin of the World Health Organization, 57 Suppl 1
(Suppl 1):17–26, Jan. 1979. ISSN 0042-9686. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2395724&tool=pmcentrez&rendertype=abstract.
[25] W. Campbell. Chemotherapy of Parasitic Diseases, vol-
ume 11. Springer Science & Business Media, 1986.
ISBN 1468412337. URL https://books.google.com/books?id=
MRvaBwAAQBAJ&pgis=1.
[26] D. Camus and T. J. Hadley. A Plasmodium falciparum Antigen
That Binds to Host Erythrocytes and Merozoites. Science, 327
(1981):19821985, 1985.
[27] F. Caro, M. G. Miller, and J. L. DeRisi. Plate-based trans-
fection and culturing technique for genetic manipulation of
Plasmodium falciparum. Malaria journal, 11(1):22, Jan. 2012.
ISSN 1475-2875. doi: 10.1186/1475-2875-11-22. URL http:
//www.malariajournal.com/content/11/1/22.
[28] A. Carreau, B. El Hafny-Rahbi, A. Matejuk, C. Grillon, and
C. Kieda. Why is the partial oxygen pressure of human
tissues a crucial parameter? Small molecules and hypoxia.
Journal of cellular and molecular medicine, 15(6):1239–53, June
2011. ISSN 1582-4934. doi: 10.1111/j.1582-4934.2011.01258.
x. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=4373326&tool=pmcentrez&rendertype=abstract.
[29] R. Carter. Speculations on the origins of Plasmodium vivax
malaria. Trends in parasitology, 19(5):214–9, May 2003. ISSN 1471-
4922. URL http://www.ncbi.nlm.nih.gov/pubmed/12763427.
[30] J. Castresana. Selection of conserved blocks from multiple align-
ments for their use in phylogenetic analysis. Molecular biology
and evolution, 17(4):540–52, Apr. 2000. ISSN 0737-4038. URL
http://www.ncbi.nlm.nih.gov/pubmed/10742046.
[31] CDC. Perspectives: Malaria in Long-Term Travelers & Expatri-
ates - Chapter 8 - 2016 Yellow Book | Travelers’ Health | CDC,
2016. URL http://wwwnc.cdc.gov/travel/yellowbook/
2016/advising-travelers-with-specific-needs/
perspectives-malaria-in-long-term-travelers-expatriates.
[32] CDC. Long-Term Travelers & Expatriates - Chap-
ter 8 - 2016 Yellow Book | Travelers’ Health | CDC,
2016. URL http://wwwnc.cdc.gov/travel/yellowbook/
2016/advising-travelers-with-specific-needs/
long-term-travelers-expatriates.
[33] R. Chandramohanadas, P. H. Davis, D. P. Beiting, M. B.
Harbut, C. Darling, G. Velmourougane, M. Y. Lee, P. A.
Greer, D. S. Roos, and D. C. Greenbaum. Apicomplexan
parasites co-opt host calpains to facilitate their escape from
infected cells. Science (New York, N.Y.), 324(5928):794–7, May
2009. ISSN 1095-9203. doi: 10.1126/science.1171085. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3391539&tool=pmcentrez&rendertype=abstract.
[34] R. Chattopadhyay, D. Rathore, H. Fujioka, S. Kumar, P. de la
Vega, D. Haynes, K. Moch, D. Fryauff, R. Wang, D. J. Carucci,
and S. L. Hoffman. PfSPATR, a Plasmodium falciparum pro-
tein containing an altered thrombospondin type I repeat do-
main is expressed at several stages of the parasite life cycle
and is the target of inhibitory antibodies. The Journal of biological
chemistry, 278(28):25977–81, July 2003. ISSN 0021-9258. doi:
10.1074/jbc.M300865200. URL http://www.ncbi.nlm.nih.gov/
pubmed/12716913.
[35] F. Chevenet, C. Brun, A.-L. Bañuls, B. Jacq, and R. Chris-
ten. TreeDyn: towards dynamic graphics and annota-
tions for analyses of trees. BMC bioinformatics, 7:439, Jan.
2006. ISSN 1471-2105. doi: 10.1186/1471-2105-7-439. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=1615880&tool=pmcentrez&rendertype=abstract.
[36] M. A. Child, C. Epp, H. Bujard, and M. J. Blackman. Regulated
maturation of malaria merozoite surface protein-1 is essential
for parasite growth. Molecular microbiology, 78(1):187–202, Oct.
2010. ISSN 1365-2958. doi: 10.1111/j.1365-2958.2010.07324.
x. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=2995310&tool=pmcentrez&rendertype=abstract.
[37] W. Chin, P. G. Contacos, G. R. Coatney, and H. R. Kimball. A
Naturally Acquired Quotidian-Type Malaria in Man Transferable
to Monkeys. Science, 149(3686):865–865, Aug. 1965. ISSN
0036-8075. doi: 10.1126/science.149.3686.865. URL http://www.
sciencemag.org/content/149/3686/865.1.abstract.
[38] C. R. Collins, F. Hackett, M. Strath, M. Penzo, C. Withers-
Martinez, D. A. Baker, and M. J. Blackman. Malaria
parasite cGMP-dependent protein kinase regulates
blood stage merozoite secretory organelle discharge
and egress. PLoS pathogens, 9(5):e1003344, May 2013.
ISSN 1553-7374. doi: 10.1371/journal.ppat.1003344. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3649973&tool=pmcentrez&rendertype=abstract.
[39] A. Combe, D. Giovannini, T. G. Carvalho, S. Spath, B. Bois-
son, C. Loussert, S. Thiberge, C. Lacroix, P. Gueirard, and
R. Ménard. Clonal conditional mutagenesis in malaria parasites.
Cell host & microbe, 5(4):386–96, Apr. 2009. ISSN 1934-
6069. doi: 10.1016/j.chom.2009.03.008. URL http://www.ncbi.
nlm.nih.gov/pubmed/19380117.
[40] O. E. Cornejo and A. A. Escalante. The origin and age
of Plasmodium vivax. Trends in parasitology, 22(12):558–63,
Dec. 2006. ISSN 1471-4922. doi: 10.1016/j.pt.2006.09.
007. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=1855252&tool=pmcentrez&rendertype=abstract.
[41] A. F. Cowman and B. S. Crabb. Invasion of red blood cells by
malaria parasites. Cell, 124(4):755–66, Feb. 2006. ISSN 0092-8674.
doi: 10.1016/j.cell.2006.02.006. URL http://www.ncbi.nlm.nih.
gov/pubmed/16497586.
[42] A. F. Cowman, D. Berry, and J. Baum. The cellular and
molecular basis for malaria parasite invasion of the human
red blood cell. The Journal of cell biology, 198(6):961–71, Sept.
2012. ISSN 1540-8140. doi: 10.1083/jcb.201206112. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3444787&tool=pmcentrez&rendertype=abstract.
[43] J. Cox-Singh and R. Culleton. Plasmodium knowlesi: from severe
zoonosis to animal model. Trends in parasitology, pages 1–7,
Mar. 2015. ISSN 1471-5007. doi: 10.1016/j.pt.2015.03.003. URL
http://www.ncbi.nlm.nih.gov/pubmed/25837310.
[44] J. Cox-Singh, T. M. E. Davis, K.-S. Lee, S. S. G. Shamsul, A. Matusop,
S. Ratnam, H. A. Rahman, D. J. Conway, and B. Singh.
Plasmodium knowlesi malaria in humans is widely distributed
and potentially life threatening. Clinical infectious diseases : an
official publication of the Infectious Diseases Society of America, 46
(2):165–71, Jan. 2008. ISSN 1537-6591. doi: 10.1086/524888.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=2533694&tool=pmcentrez&rendertype=abstract.
[45] C. Crosnier, L. Y. Bustamante, S. J. Bartholdson, A. K.
Bei, M. Theron, M. Uchikawa, S. Mboup, O. Ndir, D. P.
Kwiatkowski, M. T. Duraisingh, J. C. Rayner, and G. J.
Wright. Basigin is a receptor essential for erythrocyte invasion
by Plasmodium falciparum. Nature, 480(7378):534–7,
Dec. 2011. ISSN 1476-4687. doi: 10.1038/nature10606. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3245779&tool=pmcentrez&rendertype=abstract.
[46] R. Culleton, C. Coban, F. Y. Zeyrek, P. Cravo, A. Kaneko,
M. Randrianarivelojosia, V. Andrianaranjaka, S. Kano, A. Farn-
ert, A. P. Arez, P. M. Sharp, R. Carter, and K. Tanabe.
The origins of African Plasmodium vivax; insights from
mitochondrial genome sequencing. PloS one, 6(12):e29137,
Jan. 2011. ISSN 1932-6203. doi: 10.1371/journal.pone.
0029137. URL http://journals.plos.org/plosone/article?
id=10.1371/journal.pone.0029137.
[47] C. Daneshvar, T. M. E. Davis, J. Cox-Singh, M. Z. Rafa’ee,
S. K. Zakaria, P. C. S. Divis, and B. Singh. Clinical and
laboratory features of human Plasmodium knowlesi in-
fection. Clinical infectious diseases : an official publication
of the Infectious Diseases Society of America, 49(6):852–60,
Sept. 2009. ISSN 1537-6591. doi: 10.1086/605439. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2843824&tool=pmcentrez&rendertype=abstract.
[48] T. F. de Koning-Ward, P. R. Gilson, and B. S. Crabb. Advances
in molecular genetic systems in malaria. Nature reviews. Microbiology,
13(6):373–387, May 2015. ISSN 1740-1534. doi: 10.1038/
nrmicro3450. URL http://dx.doi.org/10.1038/nrmicro3450.
[49] K. Deitsch, C. Driskill, and T. Wellems. Transformation
of malaria parasites by the spontaneous uptake and ex-
pression of DNA from human erythrocytes. Nucleic acids
research, 29(3):850–3, Feb. 2001. ISSN 1362-4962. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=30384&tool=pmcentrez&rendertype=abstract.
[50] X. Deng, R. Gujjar, F. El Mazouni, W. Kaminsky, N. A.
Malmquist, E. J. Goldsmith, P. K. Rathod, and M. A. Phillips.
Structural plasticity of malaria dihydroorotate dehydroge-
nase allows selective binding of diverse chemical scaffolds.
The Journal of biological chemistry, 284(39):26999–7009, Sept.
2009. ISSN 1083-351X. doi: 10.1074/jbc.M109.028589. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2785385&tool=pmcentrez&rendertype=abstract.
[51] M. B. Denis, R. Tsuyuoka, P. Lim, N. Lindegardh, P. Yi, S. N.
Top, D. Socheat, T. Fandeur, A. Annerberg, E. M. Christophel,
and P. Ringwald. Efficacy of artemether-lumefantrine for the
treatment of uncomplicated falciparum malaria in northwest
Cambodia. Tropical medicine & international health : TM &
IH, 11(12):1800–7, Dec. 2006. ISSN 1360-2276. doi: 10.1111/
j.1365-3156.2006.01739.x. URL http://www.ncbi.nlm.nih.gov/
pubmed/17176344.
[52] A. Dereeper, V. Guignon, G. Blanc, S. Audic, S. Buf-
fet, F. Chevenet, J.-F. Dufayard, S. Guindon, V. Lefort,
M. Lescot, J.-M. Claverie, and O. Gascuel. Phylogeny.fr:
robust phylogenetic analysis for the non-specialist. Nucleic
acids research, 36(Web Server issue):W465–9, July
2008. ISSN 1362-4962. doi: 10.1093/nar/gkn180. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2447785&tool=pmcentrez&rendertype=abstract.
[53] E. Deu, M. J. Leyva, V. E. Albrow, M. J. Rice, J. A. Ellman,
and M. Bogyo. Functional studies of Plasmodium falciparum
dipeptidyl aminopeptidase I using small molecule inhibitors
and active site probes. Chemistry & biology, 17(8):808–19,
Aug. 2010. ISSN 1879-1301. doi: 10.1016/j.chembiol.2010.06.
007. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=2929396&tool=pmcentrez&rendertype=abstract.
[54] P. C. S. Divis, B. Singh, F. Anderios, S. Hisam, A. Matusop,
C. H. Kocken, S. A. Assefa, C. W. Duffy, and D. J. Conway. Admixture
in Humans of Two Divergent Plasmodium knowlesi
Populations Associated with Different Macaque Host Species.
PLoS pathogens, 11(5):e1004888, May 2015. ISSN 1553-7374. doi:
10.1371/journal.ppat.1004888. URL http://www.ncbi.nlm.nih.
gov/pubmed/26020959.
[55] J. G. Doench, E. Hartenian, D. B. Graham, Z. Tothova, M. Hegde,
I. Smith, M. Sullender, B. L. Ebert, R. J. Xavier, and D. E.
Root. Rational design of highly active sgRNAs for CRISPR-
Cas9-mediated gene inactivation. Nature Biotechnology, 32(12):
1262–7, Sept. 2014. ISSN 1087-0156. doi: 10.1038/nbt.3026. URL
http://www.ncbi.nlm.nih.gov/pubmed/25184501.
[56] L. Duval, M. Fourment, E. Nerrienet, D. Rousset, S. A. Sadeuh,
S. M. Goodman, N. V. Andriaholinirina, M. Randrianarivelo-
josia, R. E. Paul, V. Robert, F. J. Ayala, and F. Ariey. African
apes as reservoirs of Plasmodium falciparum and the origin
and diversification of the Laverania subgenus. Proceedings of
the National Academy of Sciences of the United States of America,
107(23):10561–6, June 2010. ISSN 1091-6490. doi: 10.1073/
pnas.1005435107. URL http://www.pnas.org/content/107/23/
10561.
[57] J. A. Dvorak and L. H. Miller. Invasion of Erythrocytes by
Malaria Merozoites. Science, 187:57, 1975.
[58] A. Ecker, S. B. Pinto, K. W. Baker, F. C. Kafatos, and
R. E. Sinden. Plasmodium berghei: plasmodium perforin-
like protein 5 is required for mosquito midgut invasion in
Anopheles stephensi. Experimental parasitology, 116(4):504–8,
Aug. 2007. ISSN 0014-4894. doi: 10.1016/j.exppara.2007.01.
015. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=1916484&tool=pmcentrez&rendertype=abstract.
[59] A. Ecker, E. S. C. Bushell, R. Tewari, and R. E. Sinden. Reverse
genetics screen identifies six proteins important for malaria development
in the mosquito. Molecular microbiology, 70(1):209–20,
Oct. 2008. ISSN 1365-2958. doi: 10.1111/j.1365-2958.2008.06407.
x. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=2658712&tool=pmcentrez&rendertype=abstract.
[60] R. C. Edgar. MUSCLE: multiple sequence alignment with high
accuracy and high throughput. Nucleic acids research, 32(5):
1792–7, Jan. 2004. ISSN 1362-4962. doi: 10.1093/nar/gkh340.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=390337&tool=pmcentrez&rendertype=abstract.
[61] S. Egarter, N. Andenmatten, A. J. Jackson, J. A. Whitelaw,
G. Pall, J. A. Black, D. J. P. Ferguson, I. Tardieux, A. Mogilner,
and M. Meissner. The toxoplasma Acto-MyoA motor complex
is important but not essential for gliding motility and host cell
invasion. PloS one, 9(3):e91819, Jan. 2014. ISSN 1932-6203. doi:
10.1371/journal.pone.0091819. URL http://journals.plos.
org/plosone/article?id=10.1371/journal.pone.0091819.
[62] A. Farrell, S. Thirugnanam, A. Lorestani, J. D. Dvorin, K. P.
Eidell, D. J. P. Ferguson, B. R. Anderson-White, M. T. Duraisingh,
G. T. Marth, and M.-J. Gubbels. A DOC2 protein identified
by mutational profiling is essential for apicomplexan parasite
exocytosis. Science (New York, N.Y.), 335(6065):218–21, Jan.
2012. ISSN 1095-9203. doi: 10.1126/science.1210829. URL
http://www.sciencemag.org/content/335/6065/218.long.
[63] S. Fauquenoy, A. Hovasse, P.-J. Sloves, W. Morelle, T. Dilezi-
toko Alayi, T. Dilezitoko Ayali, C. Slomianny, E. Werkmeister,
C. Schaeffer, A. Van Dorsselaer, and S. Tomavo. Unusual N-
glycan structures required for trafficking Toxoplasma gondii
GAP50 to the inner membrane complex regulate host cell en-
try through parasite motility. Molecular & cellular proteomics
: MCP, 10(9):M111.008953, Sept. 2011. ISSN 1535-9484. doi:
10.1074/mcp.M111.008953. URL http://www.mcponline.org/
content/10/9/M111.008953/F9.expansion.
[64] J. Fonager, E. M. Pasini, J. A. M. Braks, O. Klop, J. Ramesar,
E. J. Remarque, I. O. C. M. Vroegrijk, S. G. van Duinen,
A. W. Thomas, S. M. Khan, M. Mann, C. H. M. Kocken,
C. J. Janse, and B. M. D. Franke-Fayard. Reduced CD36-
dependent tissue sequestration of Plasmodium-infected
erythrocytes is detrimental to malaria parasite growth in
vivo. The Journal of experimental medicine, 209(1):93–107, Jan.
2012. ISSN 1540-9538. doi: 10.1084/jem.20110762. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3260870&tool=pmcentrez&rendertype=abstract.
[65] K. Frénal, C. L. Tay, C. Mueller, E. S. Bushell, Y. Jia, A. Grain-
dorge, O. Billker, J. C. Rayner, and D. Soldati-Favre. Global
analysis of apicomplexan protein S-acyl transferases reveals an
enzyme essential for invasion. Traffic (Copenhagen, Denmark), 14
(8):895–911, Aug. 2013. ISSN 1600-0854. doi: 10.1111/tra.12081.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=3813974&tool=pmcentrez&rendertype=abstract.
[66] S. Fyfe, C. Williams, O. J. Mason, and G. J. Pickup. Apophe-
nia, theory of mind and schizotypy: perceiving meaning and
intentionality in randomness. Cortex; a journal devoted to the
study of the nervous system and behavior, 44(10):1316–25, Jan. .
ISSN 0010-9452. doi: 10.1016/j.cortex.2007.07.009. URL http:
//www.ncbi.nlm.nih.gov/pubmed/18635161.
[67] M. R. Galinski and J. W. Barnwell. Plasmodium vivax:
who cares? Malaria journal, 7 Suppl 1:S9, Jan. 2008.
ISSN 1475-2875. doi: 10.1186/1475-2875-7-S1-S9. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2604873&tool=pmcentrez&rendertype=abstract.
[68] M. R. Galinski, C. C. Medina, P. Ingravallo, and J. W. Barnwell.
A reticulocyte-binding protein complex of Plasmodium vivax
merozoites. Cell, 69(7):1213–1226, June 1992. ISSN 00928674.
doi: 10.1016/0092-8674(92)90642-P. URL http://linkinghub.
elsevier.com/retrieve/pii/009286749290642P.
[69] C. R. Garcia, F. Manzi, F. Tediosi, S. L. Hoffman, and E. R.
James. Comparative cost models of a liquid nitrogen vapor
phase (LNVP) cold chain-distributed cryopreserved malaria
vaccine vs. a conventional vaccine. Vaccine, 31(2):380–6,
Jan. 2013. ISSN 1873-2518. doi: 10.1016/j.vaccine.2012.10.
109. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=3666854&tool=pmcentrez&rendertype=abstract.
[70] M. J. Gardner, N. Hall, E. Fung, O. White, M. Berriman, R. W.
Hyman, J. M. Carlton, A. Pain, K. E. Nelson, S. Bowman, I. T.
Paulsen, K. James, J. A. Eisen, K. Rutherford, S. L. Salzberg,
A. Craig, S. Kyes, M.-S. Chan, V. Nene, S. J. Shallom, B. Suh,
J. Peterson, S. Angiuoli, M. Pertea, J. Allen, J. Selengut, D. Haft,
M. W. Mather, A. B. Vaidya, D. M. A. Martin, A. H. Fair-
lamb, M. J. Fraunholz, D. S. Roos, S. A. Ralph, G. I. McFad-
den, L. M. Cummings, G. M. Subramanian, C. Mungall, J. C.
Venter, D. J. Carucci, S. L. Hoffman, C. Newbold, R. W. Davis,
C. M. Fraser, and B. Barrell. Genome sequence of the human
malaria parasite Plasmodium falciparum. Nature, 419(6906):
498–511, Oct. 2002. ISSN 0028-0836. doi: 10.1038/nature01097.
URL http://dx.doi.org/10.1038/nature01097.
[71] S. Y. Gerdes, M. D. Scholle, J. W. Campbell, G. Balázsi,
E. Ravasz, M. D. Daugherty, A. L. Somera, N. C. Kyrpides,
I. Anderson, M. S. Gelfand, A. Bhattacharya, V. Kapatral,
M. D’Souza, M. V. Baev, Y. Grechkin, F. Mseeh, M. Y. Fonstein,
R. Overbeek, A.-L. Barabási, Z. N. Oltvai, and A. L. Oster-
man. Experimental determination and system level analysis
of essential genes in Escherichia coli MG1655. Journal of
bacteriology, 185(19):5673–84, Oct. 2003. ISSN 0021-9193. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=193955&tool=pmcentrez&rendertype=abstract.
[72] M. Ghorbal, M. Gorman, C. R. Macpherson, R. M. Martins,
A. Scherf, and J.-J. Lopez-Rubio. Genome editing in the human
malaria parasite Plasmodium falciparum using the CRISPR-
Cas9 system. Nature biotechnology, 32(8):819–821, 2014. ISSN
1546-1696. doi: 10.1038/nbt.2925. URL http://dx.doi.org/10.
1038/nbt.2925.
[73] G. Giaever, A. M. Chu, L. Ni, C. Connelly, L. Riles, S. Véron-
neau, S. Dow, A. Lucau-Danila, K. Anderson, B. André,
A. P. Arkin, A. Astromoff, M. El-Bakkoury, R. Bangham,
R. Benito, S. Brachat, S. Campanaro, M. Curtiss, K. Davis,
A. Deutschbauer, K.-D. Entian, P. Flaherty, F. Foury, D. J.
Garfinkel, M. Gerstein, D. Gotte, U. Güldener, J. H. Hege-
mann, S. Hempel, Z. Herman, D. F. Jaramillo, D. E. Kelly, S. L.
Kelly, P. Kötter, D. LaBonte, D. C. Lamb, N. Lan, H. Liang,
H. Liao, L. Liu, C. Luo, M. Lussier, R. Mao, P. Menard,
S. L. Ooi, J. L. Revuelta, C. J. Roberts, M. Rose, P. Ross-
Macdonald, B. Scherens, G. Schimmack, B. Shafer, D. D. Shoe-
maker, S. Sookhai-Mahadeo, R. K. Storms, J. N. Strathern,
G. Valle, M. Voet, G. Volckaert, C.-y. Wang, T. R. Ward, J. Wil-
helmy, E. A. Winzeler, Y. Yang, G. Yen, E. Youngman, K. Yu,
H. Bussey, J. D. Boeke, M. Snyder, P. Philippsen, R. W. Davis,
and M. Johnston. Functional profiling of the Saccharomyces
cerevisiae genome. Nature, 418(6896):387–91, July 2002. ISSN
0028-0836. doi: 10.1038/nature00935. URL http://www.ncbi.
nlm.nih.gov/pubmed/12140549.
[74] D. G. Gibson, J. I. Glass, C. Lartigue, V. N. Noskov, R.-
Y. Chuang, M. A. Algire, G. A. Benders, M. G. Montague,
L. Ma, M. M. Moodie, C. Merryman, S. Vashee, R. Krishnaku-
mar, N. Assad-Garcia, C. Andrews-Pfannkoch, E. A. Denisova,
L. Young, Z.-Q. Qi, T. H. Segall-Shapiro, C. H. Calvey, P. P.
Parmar, C. A. Hutchison, H. O. Smith, and J. C. Venter. Creation
of a bacterial cell controlled by a chemically synthesized
genome. Science (New York, N.Y.), 329(5987):52–6, July
2010. ISSN 1095-9203. doi: 10.1126/science.1190719. URL
http://www.ncbi.nlm.nih.gov/pubmed/20488990.
[75] P. R. Gilson and B. S. Crabb. Morphology and kinetics of the
three distinct phases of red blood cell invasion by Plasmodium
falciparum merozoites. International journal for parasitology, 39
(1):91–6, Jan. 2009. ISSN 1879-0135. doi: 10.1016/j.ijpara.2008.
09.007. URL http://www.ncbi.nlm.nih.gov/pubmed/18952091.
[76] D. Giovannini, S. Späth, C. Lacroix, A. Perazzi, D. Bargieri,
V. Lagal, C. Lebugle, A. Combe, S. Thiberge, P. Baldacci,
I. Tardieux, and R. Ménard. Independent roles of apical membrane
antigen 1 and rhoptry neck proteins during host cell invasion
by apicomplexa. Cell host & microbe, 10(6):591–602, Dec.
2011. ISSN 1934-6069. doi: 10.1016/j.chom.2011.10.012. URL
http://www.ncbi.nlm.nih.gov/pubmed/22177563.
[77] S. Glushakova, V. Lizunov, P. S. Blank, K. Melikov,
G. Humphrey, and J. Zimmerberg. Cytoplasmic free Ca2+ is
essential for multiple steps in malaria parasite egress from in-
fected erythrocytes. Malaria journal, 12(1):41, Jan. 2013. ISSN
1475-2875. doi: 10.1186/1475-2875-12-41. URL http://www.
malariajournal.com/content/12/1/41.
[78] R. Godiska, D. Mead, V. Dhodda, C. Wu, R. Hochstein,
A. Karsi, K. Usdin, A. Entezam, and N. Ravin. Linear plas-
mid vector for cloning of repetitive or unstable sequences
in Escherichia coli. Nucleic acids research, 38(6):e88, Apr.
2010. ISSN 1362-4962. doi: 10.1093/nar/gkp1181. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2847241&tool=pmcentrez&rendertype=abstract.
[79] A. R. Gomes, E. Bushell, F. Schwach, G. Girling, B. Anar, M. A.
Quail, C. Herd, C. Pfander, K. Modrzynska, J. C. Rayner, and
O. Billker. A Genome-Scale Vector Resource Enables High-
Throughput Reverse Genetic Screening in a Malaria Parasite.
Cell Host & Microbe, Feb. 2015. ISSN 19313128. doi: 10.
1016/j.chom.2015.01.014. URL http://www.cell.com/article/
S1931312815000347/fulltext.
[80] A. R. B. Gomes. High-throughput reverse genetic screening in Plas-
modium berghei using barcode sequencing. PhD thesis, 2014.
[81] C. A. Goodman, P. G. Coleman, and A. J. Mills. Cost-
effectiveness of malaria control in sub-Saharan Africa. Lancet
(London, England), 354(9176):378–85, July 1999. ISSN 0140-6736.
URL http://www.ncbi.nlm.nih.gov/pubmed/10437867.
[82] J. L. Gordon. Characterization of Actin-like Protein 1 (ALP1),
a Novel Actin-related Protein in the Apicomplexan Parasite Toxo-
plasma Gondii. PhD thesis, Washington University, 2007. URL
https://books.google.com/books?id=QBn6UdlaHwkC&pgis=1.
[83] O. A. Gorleku, A.-M. Barns, G. R. Prescott, J. Greaves, and
L. H. Chamberlain. Endoplasmic reticulum localization of
DHHC palmitoyltransferases mediated by lysine-based sorting
signals. The Journal of biological chemistry, 286(45):39573–84, Nov.
2011. ISSN 1083-351X. doi: 10.1074/jbc.M111.272369. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3234780&tool=pmcentrez&rendertype=abstract.
[84] D. C. Greenbaum, A. Baruch, M. Grainger, Z. Bozdech, K. F.
Medzihradszky, J. Engel, J. DeRisi, A. A. Holder, and M. Bogyo.
A role for the protease falcipain 1 in host cell invasion by the human
malaria parasite. Science (New York, N.Y.), 298(5600):2002–6,
Dec. 2002. ISSN 1095-9203. doi: 10.1126/science.1077426. URL
http://www.ncbi.nlm.nih.gov/pubmed/12471262.
[85] S. Guindon and O. Gascuel. A simple, fast, and accurate algo-
rithm to estimate large phylogenies by maximum likelihood.
Systematic biology, 52(5):696–704, Oct. 2003. ISSN 1063-5157.
URL http://www.ncbi.nlm.nih.gov/pubmed/14530136.
[86] W. E. Gutteridge and P. I. Trigg. Action of pyrimethamine
and related drugs against Plasmodium knowlesi in vitro. Para-
sitology, 62(03):431, Apr. 1971. ISSN 0031-1820. doi: 10.1017/
S0031182000077581. URL http://journals.cambridge.org/
abstract_S0031182000077581.
[87] D. S. Guttery, B. Poulin, A. Ramaprasad, R. J. Wall, D. J. P.
Ferguson, D. Brady, E.-M. Patzewitz, S. Whipple, U. Straschil,
M. H. Wright, A. M. A. H. Mohamed, A. Radhakrishnan, S. T.
Arold, E. W. Tate, A. A. Holder, B. Wickstead, A. Pain, and
R. Tewari. Genome-wide functional analysis of Plasmodium
protein phosphatases reveals key regulators of parasite development
and differentiation. Cell host & microbe, 16(1):128–40,
July 2014. ISSN 1934-6069. doi: 10.1016/j.chom.2014.05.
020. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=4094981&tool=pmcentrez&rendertype=abstract.
[88] S. Hasenkamp, K. T. Russell, and P. Horrocks. Com-
parison of the absolute and relative efficiencies of
electroporation-based transfection protocols for Plas-
modium falciparum. Malaria journal, 11(1):210, 2012.
ISSN 1475-2875. doi: 10.1186/1475-2875-11-210. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3407700&tool=pmcentrez&rendertype=abstract.
[89] C. H. Hastings. Novel malaria parasite proteins involved in
erythrocyte invasion. (December), 2012.
[90] P. W. Hedrick. Population genetics of malaria resistance
in humans. Heredity, 107(4):283–304, Oct.
2011. ISSN 1365-2540. doi: 10.1038/hdy.2011.16. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3182497&tool=pmcentrez&rendertype=abstract.
[91] M. Hirai, M. Arai, S. Kawai, and H. Matsuoka. PbGCbeta is
essential for Plasmodium ookinete motility to invade midgut
cell and for successful completion of parasite life cycle in
mosquitoes. Journal of biochemistry, 140(5):747–57, Nov. 2006.
ISSN 0021-924X. doi: 10.1093/jb/mvj205. URL http://www.
ncbi.nlm.nih.gov/pubmed/17030505.
[92] G. Hu, A. Cabrera, M. Kono, S. Mok, B. K. Chaal, S. Haase,
K. Engelberg, S. Cheemadan, T. Spielmann, P. R. Preiser, T.-W.
Gilberger, and Z. Bozdech. Transcriptional profiling of growth
perturbations of the human malaria parasite Plasmodium falciparum.
Nature biotechnology, 28(1):91–8, Jan. 2010. ISSN 1546-
1696. doi: 10.1038/nbt.1597. URL http://www.ncbi.nlm.nih.
gov/pubmed/20037583.
[93] R. Idro, K. Marsh, C. C. John, and C. R. J. Newton. Cerebral
malaria: Mechanisms of brain injury and strategies for improved
neurocognitive outcome. Pediatric Research, 68(4):267–274,
2010. ISSN 00313998. doi: 10.1203/PDR.0b013e3181eee738.
[94] Illumina. Specifications for HiSeq 2500. URL
http://www.illumina.com/systems/hiseq_2500_1500/performance_specifications.html.
[95] D. Jacot, N. Tosseti, A. Graindorge, and D. Soldati-Favre. Motil-
ity, invasion and egress critically depend on an universal con-
nector that links the actomyosin system with adhesins. In Molec-
ular Parasitology Meeting, 2015.
[96] R. Jambou, F. El-Assaad, V. Combes, and G. E. Grau. In
vitro culture of Plasmodium berghei-ANKA maintains in-
fectivity of mouse erythrocytes inducing cerebral malaria.
Malaria journal, 10(1):346, Jan. 2011. ISSN 1475-2875. doi:
10.1186/1475-2875-10-346. URL http://www.malariajournal.
com/content/10/1/346.
[97] C. J. Janse, B. Franke-Fayard, G. R. Mair, J. Ramesar, C. Thiel,
S. Engelmann, K. Matuschewski, G. J. van Gemert, R. W.
Sauerwein, and A. P. Waters. High efficiency transfection
of Plasmodium berghei facilitates novel selection procedures.
Molecular and biochemical parasitology, 145(1):60–70,
Jan. 2006. ISSN 0166-6851. doi: 10.1016/j.molbiopara.2005.09.
007. URL http://www.sciencedirect.com/science/article/
pii/S0166685105002859.
[98] H. Jeong, S. P. Mason, A. L. Barabási, and Z. N. Oltvai. Lethality
and centrality in protein networks. Nature, 411(6833):41–2, May
2001. ISSN 0028-0836. doi: 10.1038/35075138. URL http://dx.
doi.org/10.1038/35075138.
[99] M. L. Jones, E. L. Kitson, and J. C. Rayner. Plasmodium fal-
ciparum erythrocyte invasion: A conserved myosin associated
complex. Molecular and Biochemical Parasitology, 147(1):74–84,
2006. ISSN 01666851. doi: 10.1016/j.molbiopara.2006.01.009.
[100] S. Jongwutiwes, P. Buppan, R. Kosuvin, S. Seethamchai,
U. Pattanawong, J. Sirichaisinthop, and C. Putaporntip.
Plasmodium knowlesi Malaria in humans and macaques,
Thailand. Emerging infectious diseases, 17(10):1799–806, Oct.
2011. ISSN 1080-6059. doi: 10.3201/eid1710.110349. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3310673&tool=pmcentrez&rendertype=abstract.
[101] D. A. Joy, X. Feng, J. Mu, T. Furuya, K. Chotivanich, A. U.
Krettli, M. Ho, A. Wang, N. J. White, E. Suh, P. Beerli, and
X.-z. Su. Early origin and recent expansion of Plasmodium
falciparum. Science (New York, N.Y.), 300(5617):318–21, Apr.
2003. ISSN 1095-9203. doi: 10.1126/science.1081449. URL http:
//www.sciencemag.org/content/300/5617/318.abstract.
[102] B. F. C. Kafsack, N. Rovira-Graells, T. G. Clark, C. Bancells,
V. M. Crowley, S. G. Campino, A. E. Williams, L. G. Drought,
D. P. Kwiatkowski, D. A. Baker, A. Cortés, and M. Llinás. A
transcriptional switch underlies commitment to sexual development
in malaria parasites. Nature, 507(7491):248–52, Mar.
2014. ISSN 1476-4687. doi: 10.1038/nature12920. URL http:
//dx.doi.org/10.1038/nature12920.
[103] I. Kaneko, S. Iwanaga, T. Kato, I. Kobayashi, and M. Yuda.
Genome-Wide Identification of the Target Genes of AP2-
O, a Plasmodium AP2-Family Transcription Factor. PLoS
pathogens, 11(5):e1004905, May 2015. ISSN 1553-7374. doi: 10.
1371/journal.ppat.1004905. URL http://journals.plos.org/
plospathogens/article?id=10.1371/journal.ppat.1004905.
[104] S. H. I. Kappe, A. M. Vaughan, J. A. Boddey, and A. F. Cowman.
That Was Then But This Is Now: Malaria Research in the Time
of an. Science, 328, 2010.
[105] S. M. Khan, H. Kroeze, B. Franke-Fayard, and C. J. Janse. Stan-
dardization in generating and reporting genetically modified
rodent malaria parasites: the RMgmDB database. Methods in
molecular biology (Clifton, N.J.), 923:139–50, Jan. 2013. ISSN 1940-
6029. doi: 10.1007/978-1-62703-026-7_9. URL http://www.
ncbi.nlm.nih.gov/pubmed/22990775.
[106] S. Khuri and S. Wuchty. Essentiality and centrality in protein in-
teraction networks revisited. BMC bioinformatics, 16(1):109, Jan.
2015. ISSN 1471-2105. doi: 10.1186/s12859-015-0536-x. URL
http://www.biomedcentral.com/1471-2105/16/109.
[107] L. A. Kirkman, E. A. Lawrence, and K. W. Deitsch.
Malaria parasites utilize both homologous recombina-
tion and alternative end joining pathways to maintain
genome integrity. Nucleic acids research, 42(1):370–9, Jan.
2014. ISSN 1362-4962. doi: 10.1093/nar/gkt881. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3874194&tool=pmcentrez&rendertype=abstract.
[108] R. M. Knowles and B. Das Gupta. A study of monkey-malaria
and its experimental transmission to man. Ind Med Gaz, 67:301–320,
1932.
[109] C. H. M. Kocken, H. Ozwara, A. V. D. Wel, A. L. Beetsma, J. M.
Mwenda, and A. W. Thomas. Plasmodium knowlesi Provides
a Rapid In Vitro and In Vivo Transfection System That Enables
Double-Crossover Gene Knockout Studies. Infection and Immunity, 70(2):655–660, 2002.
doi: 10.1128/IAI.70.2.655.
[110] M. Kono, S. Herrmann, N. B. Loughran, A. Cabrera, K. En-
gelberg, C. Lehmann, D. Sinha, B. Prinz, U. Ruch, V. Heus-
sler, T. Spielmann, J. Parkinson, and T. W. Gilberger. Evolu-
tion and architecture of the inner membrane complex in asex-
ual and sexual stages of the malaria parasite. Molecular biology
and evolution, 29(9):2113–32, Sept. 2012. ISSN 1537-1719. doi:
10.1093/molbev/mss081. URL http://www.ncbi.nlm.nih.gov/
pubmed/22389454.
[111] I. Kozarewa, Z. Ning, M. A. Quail, M. J. Sanders, M. Berri-
man, and D. J. Turner. Amplification-free Illumina sequencing-
library preparation facilitates improved mapping and assem-
bly of (G+C)-biased genomes. Nature methods, 6(4):291–5, Apr.
2009. ISSN 1548-7105. doi: 10.1038/nmeth.1311. URL http:
//dx.doi.org/10.1038/nmeth.1311.
[112] U. Krzych, S. Zarling, and A. Pichugin. Memory T cells
maintain protracted protection against malaria. Immunology
letters, 161(2):189–95, Oct. 2014. ISSN 1879-0542. doi: 10.
1016/j.imlet.2014.03.011. URL http://www.ncbi.nlm.nih.gov/
pubmed/24709142.
[113] I. Kursula, P. Kursula, M. Ganter, S. Panjikar, K. Matuschewski,
and H. Schüler. Structural basis for parasite-specific functions
of the divergent profilin of Plasmodium falciparum. Structure
(London, England : 1993), 16(11):1638–48, Nov. 2008. ISSN 0969-
2126. doi: 10.1016/j.str.2008.09.008. URL http://www.ncbi.nlm.
nih.gov/pubmed/19000816.
[114] D. P. Kwiatkowski. How malaria has affected the human
genome and what human genetics can teach us about
malaria. American journal of human genetics, 77(2):171–92,
Aug. 2005. ISSN 0002-9297. doi: 10.1086/432519. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=1224522&tool=pmcentrez&rendertype=abstract.
[115] A. H. Lee, L. S. Symington, and D. A. Fidock. DNA Repair Mech-
anisms and Their Biological Roles in the Malaria Parasite Plas-
modium falciparum. Microbiology and Molecular Biology Reviews,
78(3):469–486, Sept. 2014. ISSN 1092-2172. doi: 10.1128/MMBR.
00059-13. URL http://mmbr.asm.org/cgi/doi/10.1128/MMBR.
00059-13.
[116] K.-S. Lee, P. C. S. Divis, S. K. Zakaria, A. Matusop, R. A.
Julin, D. J. Conway, J. Cox-Singh, and B. Singh. Plasmodium
knowlesi: reservoir hosts and tracking the emergence in
humans and macaques. PLoS pathogens, 7(4):e1002015, Apr.
2011. ISSN 1553-7374. doi: 10.1371/journal.ppat.1002015. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3072369&tool=pmcentrez&rendertype=abstract.
[117] Leiden. Rodent Malaria Genetic Modification database.
[118] L. Li, C. J. Stoeckert, and D. S. Roos. OrthoMCL: identification
of ortholog groups for eukaryotic genomes. Genome research,
13(9):2178–89, Sept. 2003. ISSN 1088-9051. doi: 10.1101/gr.
1224503. URL http://genome.cshlp.org/content/13/9/2178.
full.
[119] M. R. Lieber. The mechanism of double-strand DNA break
repair by the nonhomologous DNA end-joining pathway.
Annual review of biochemistry, 79:181–211, Jan. 2010. ISSN
1545-4509. doi: 10.1146/annurev.biochem.052308.093131. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3079308&tool=pmcentrez&rendertype=abstract.
[120] C. Lim, E. Hansen, T. M. DeSimone, Y. Moreno, K. Junker,
A. Bei, C. Brugnara, C. O. Buckee, and M. T. Duraisingh.
Expansion of host cellular niche can drive adaptation of a
zoonotic malaria parasite to humans. Nature communications,
4:1638, Jan. 2013. ISSN 2041-1723. doi: 10.1038/ncomms2612.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=3762474&tool=pmcentrez&rendertype=abstract.
[121] C. Lim, J. Goldberg, A. Griggs, C. Gruring, D. Neafsey, and
M. Duraisingh. Changes in Plasmodium knowlesi invasin lig-
and genes are associated with adaptation to human red blood
cells. In Molecular Parasitology Meeting, 2015.
[122] J. Limenitakis and D. Soldati-Favre. Functional genetics in Api-
complexa: Potentials and limits. FEBS Letters, 585(11):1579–1588,
2011. ISSN 00145793. doi: 10.1016/j.febslet.2011.05.002.
URL http://dx.doi.org/10.1016/j.febslet.2011.05.002.
[123] J. Lin. Generation of genetically attenuated blood-stage malaria par-
asites: characterizing growth and virulence in a rodent model of
malaria. PhD thesis, 2013.
[124] J. W. Lin, P. Meireles, M. Prudêncio, S. Engelmann, T. Annoura,
M. Sajid, S. Chevalley-Maurel, J. Ramesar, C. Nahar, C. M. C.
Avramut, A. J. Koster, K. Matuschewski, A. P. Waters, C. J. Janse,
G. R. Mair, and S. M. Khan. Loss-of-function analyses defines vi-
tal and redundant functions of the Plasmodium rhomboid pro-
tease family. Molecular Microbiology, 88(2):318–338, 2013. ISSN
0950382X. doi: 10.1111/mmi.12187.
[125] S. E. Lindner, K. E. Swearingen, A. Harupa, A. M. Vaughan,
P. Sinnis, R. L. Moritz, and S. H. I. Kappe. Total and Putative
Surface Proteomics of Malaria Parasite Salivary Gland Sporo-
zoites. Molecular & Cellular Proteomics, 12(5):1127–1143, Jan.
2013. ISSN 1535-9476. doi: 10.1074/mcp.M112.024505. URL
http://www.mcponline.org/content/12/5/1127.full.
[126] W. Liu, Y. Li, G. H. Learn, R. S. Rudicell, J. D. Robertson,
B. F. Keele, J.-B. N. Ndjango, C. M. Sanz, D. B. Morgan,
S. Locatelli, M. K. Gonder, P. J. Kranzusch, P. D. Walsh,
E. Delaporte, E. Mpoudi-Ngole, A. V. Georgiev, M. N. Muller,
G. M. Shaw, M. Peeters, P. M. Sharp, J. C. Rayner, and
B. H. Hahn. Origin of the human malaria parasite Plas-
modium falciparum in gorillas. Nature, 467(7314):420–425,
Sept. 2010. ISSN 0028-0836. doi: 10.1038/nature09442. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2997044&tool=pmcentrez&rendertype=abstract.
[127] W. Liu, Y. Li, K. S. Shaw, G. H. Learn, L. J. Plenderleith, J. A.
Malenke, S. A. Sundararaman, M. A. Ramirez, P. A. Crystal,
A. G. Smith, F. Bibollet-Ruche, A. Ayouba, S. Locatelli, A. Este-
ban, F. Mouacha, E. Guichet, C. Butel, S. Ahuka-Mundeke, B.-
I. Inogwabini, J.-B. N. Ndjango, S. Speede, C. M. Sanz, D. B.
Morgan, M. K. Gonder, P. J. Kranzusch, P. D. Walsh, A. V.
Georgiev, M. N. Muller, A. K. Piel, F. A. Stewart, M. L. Wil-
son, A. E. Pusey, L. Cui, Z. Wang, A. Färnert, C. J. Suther-
land, D. Nolder, J. A. Hart, T. B. Hart, P. Bertolani, A. Gillis,
M. LeBreton, B. Tafon, J. Kiyang, C. F. Djoko, B. S. Schnei-
der, N. D. Wolfe, E. Mpoudi-Ngole, E. Delaporte, R. Carter,
R. L. Culleton, G. M. Shaw, J. C. Rayner, M. Peeters, B. H.
Hahn, and P. M. Sharp. African origin of the malaria par-
asite Plasmodium vivax. Nature communications, 5:3346, Jan.
2014. ISSN 2041-1723. doi: 10.1038/ncomms4346. URL http:
//europepmc.org/articles/PMC4089193.
[128] M. J. López-Barragán, J. Lemieux, M. Quiñones, K. C.
Williamson, A. Molina-Cruz, K. Cui, C. Barillas-Mury,
K. Zhao, and X.-z. Su. Directional gene expression and
antisense transcripts in sexual and asexual stages of Plas-
modium falciparum. BMC genomics, 12:587, Jan. 2011.
ISSN 1471-2164. doi: 10.1186/1471-2164-12-587. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3266614&tool=pmcentrez&rendertype=abstract.
[129] A. Lorestani, L. Sheiner, K. Yang, S. D. Robertson, N. Sahoo,
C. F. Brooks, D. J. P. Ferguson, B. Striepen, and M.-J. Gubbels.
A Toxoplasma MORN1 null mutant undergoes repeated divi-
sions but is defective in basal assembly, apicoplast division and
cytokinesis. PloS one, 5(8):e12302, Jan. 2010. ISSN 1932-6203. doi:
10.1371/journal.pone.0012302. URL http://journals.plos.
org/plosone/article?id=10.1371/journal.pone.0012302.
[130] L. J. Mackey, A. Hochmann, C. H. June, C. E. Contr-
eras, and P. H. Lambert. Immunopathological aspects
of Plasmodium berghei infection in five strains of mice.
II. Immunopathology of cerebral and other tissue le-
sions during the infection. Clinical and experimental im-
munology, 42(3):412–20, Dec. 1980. ISSN 0009-9104. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=1537166&tool=pmcentrez&rendertype=abstract.
[131] M. Madamet. Malaria Prophylaxis Failure with Doxycycline,
Central African Republic, 2014. Emerging Infectious Disease jour-
nal, 21(8), 2015. URL http://wwwnc.cdc.gov/eid/article/21/
8/15-0524_article.
[132] B. Mahajan, D. Jani, R. Chattopadhyay, R. Nagarkatti, H. Zheng,
V. Majam, W. Weiss, S. Kumar, and D. Rathore. Identification,
cloning, expression, and characterization of the gene for Plas-
modium knowlesi surface protein containing an altered throm-
bospondin repeat domain. Infection and immunity, 73(9):5402–9,
Sept. 2005. ISSN 0019-9567. doi: 10.1128/IAI.73.9.5402-5409.
2005. URL http://iai.asm.org/content/73/9/5402.full.
[133] P. Mali, K. M. Esvelt, and G. M. Church. Cas9 as a versatile
tool for engineering biology. Nature methods, 10(10):957–63, Oct.
2013. ISSN 1548-7105. doi: 10.1038/nmeth.2649. URL http:
//dx.doi.org/10.1038/nmeth.2649.
[134] B. Malleret, A. Li, R. Zhang, K. S. W. Tan, R. Suwanarusk,
C. Claser, J. S. Cho, E. G. L. Koh, C. S. Chu, S. Pukrit-
tayakamee, M. L. Ng, F. Ginhoux, L. G. Ng, C. T. Lim, F. Nos-
ten, G. Snounou, L. Rénia, and B. Russell. Plasmodium vi-
vax: restricted tropism and rapid remodeling of CD71-positive
reticulocytes. Blood, 125(8):1314–24, Feb. 2015. ISSN 1528-
0020. doi: 10.1182/blood-2014-08-596015. URL http://www.
bloodjournal.org/content/125/8/1314.abstract.
[135] C. B. Mamoun, I. Y. Gluzman, S. Goyard, S. M. Bev-
erley, and D. E. Goldberg. A set of independent se-
lectable markers for transfection of the human malaria
parasite Plasmodium falciparum. Proceedings of the Na-
tional Academy of Sciences of the United States of Amer-
ica, 96(15):8716–20, July 1999. ISSN 0027-8424. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=17582&tool=pmcentrez&rendertype=abstract.
[136] M. Mani, K. Kandavelou, F. J. Dy, S. Durai, and S. Chan-
drasegaran. Design, engineering, and characterization of zinc
finger nucleases. Biochemical and Biophysical Research Communi-
cations, 335(2):447–457, 2005. ISSN 0006291X. doi: 10.1016/j.
bbrc.2005.07.089.
[137] L. Martín-Jaular, A. Elizalde-Torrent, R. Thomson-Luque,
M. Ferrer, J. C. Segovia, E. Herreros-Aviles, C. Fernández-
Becerra, and H. A. del Portillo. Reticulocyte-prone malaria par-
asites predominantly invade CD71hi immature cells: implica-
tions for the development of an in vitro culture for Plasmod-
ium vivax. Malaria Journal, 12(1):434, 2013. ISSN 1475-2875. doi:
10.1186/1475-2875-12-434. URL http://www.malariajournal.
com/content/12/1/434.
[138] M. T. McIntosh, A. Vaid, H. D. Hosgood, J. Vijay, A. Bhat-
tacharya, M. H. Sahani, P. Baevova, K. A. Joiner, and P. Sharma.
Traffic to the malaria parasite food vacuole: a novel path-
way involving a phosphatidylinositol 3-phosphate-binding pro-
tein. The Journal of biological chemistry, 282(15):11499–508, Apr.
2007. ISSN 0021-9258. doi: 10.1074/jbc.M610974200. URL
http://www.jbc.org/content/282/15/11499.
[139] D. Ménard, C. Barnadas, C. Bouchier, C. Henry-Halldin, L. R.
Gray, A. Ratsimbasoa, V. Thonier, J.-F. Carod, O. Domarle,
Y. Colin, O. Bertrand, J. Picot, C. L. King, B. T. Grimberg,
O. Mercereau-Puijalon, and P. A. Zimmerman. Plasmodium
vivax clinical malaria is commonly observed in Duffy-negative
Malagasy people. Proceedings of the National Academy of
Sciences of the United States of America, 107(13):5967–71, Mar.
2010. ISSN 1091-6490. doi: 10.1073/pnas.0912496107. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2851935&tool=pmcentrez&rendertype=abstract.
[140] E. V. S. Meyer, A. A. Semenya, D. M. N. Okenu, R. Anton, L. H.
Bannister, J. W. Barnwell, and M. R. Galinski. The reticulocyte
binding-like proteins of P. knowlesi locate to the micronemes
of merozoites and define two new members of this invasion
ligand family. Mol Biochem Parasitol, 165(2):111–121, 2009. doi:
10.1016/j.molbiopara.2009.01.012.
[141] L. H. Miller, S. J. Mason, D. F. Clyde, and M. H. McGinniss.
The Resistance Factor to Plasmodium vivax in Blacks. New
England Journal of Medicine, 295(6):302–304, Aug. 1976. ISSN
0028-4793. doi: 10.1056/NEJM197608052950602. URL http:
//www.ncbi.nlm.nih.gov/pubmed/778616.
[142] R. W. Moon, J. Hall, F. Rangkuti, Y. S. Ho, N. Almond, G. H.
Mitchell, A. Pain, A. A. Holder, and M. J. Blackman. Adapta-
tion of the genetically tractable malaria pathogen Plasmodium
knowlesi to continuous culture in human erythrocytes. Pro-
ceedings of the National Academy of Sciences of the United States of
America, 110(2):531–6, Jan. 2013. ISSN 1091-6490. doi: 10.1073/
pnas.1216457110. URL http://www.ncbi.nlm.nih.gov/pubmed/
23267069.
[143] S. Moonah, N. G. Sanders, J. K. Persichetti, and
D. J. Sullivan. Erythrocyte lysis and Xenopus lae-
vis oocyte rupture by recombinant Plasmodium falci-
parum hemolysin III. Eukaryotic cell, 13(10):1337–45, Oct.
2014. ISSN 1535-9786. doi: 10.1128/EC.00088-14. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=4187644&tool=pmcentrez&rendertype=abstract.
[144] R. R. Moraes Barros, J. Straimer, J. M. Sa, R. E. Salzman, V. A.
Melendez-Muniz, J. Mu, D. A. Fidock, and T. E. Wellems. Edit-
ing the Plasmodium vivax genome, using zinc-finger nucleases.
The Journal of infectious diseases, 211(1):125–9, Jan. 2015. ISSN
1537-6613. doi: 10.1093/infdis/jiu423. URL http://www.ncbi.
nlm.nih.gov/pubmed/25081932.
[145] E. J. Mui, G. A. Schiehser, W. K. Milhous, H. Hsu, C. W.
Roberts, M. Kirisits, S. Muench, D. Rice, J. P. Dubey, J. W. Fow-
ble, P. K. Rathod, S. F. Queener, S. R. Liu, D. P. Jacobus, and
R. McLeod. Novel triazine JPC-2067-B inhibits Toxoplasma
gondii in vitro and in vivo. PLoS neglected tropical diseases, 2
(3):e190, Jan. 2008. ISSN 1935-2735. doi: 10.1371/journal.pntd.
0000190. URL http://journals.plos.org/plosntds/article?
id=10.1371/journal.pntd.0000190.
[146] C. Nusslein-Volhard. The identification of genes controlling de-
velopment in flies and fishes. In Nobel Lecture, 1995.
[147] R. A. O’Donnell, L. H. Freitas-Junior, P. R. Preiser, D. H.
Williamson, M. Duraisingh, T. F. McElwain, A. Scherf, A. F.
Cowman, and B. S. Crabb. A genetic screen for improved plas-
mid segregation reveals a role for Rep20 in the interaction of
Plasmodium falciparum chromosomes. The EMBO journal, 21
(5):1231–9, Mar. 2002. ISSN 0261-4189. doi: 10.1093/emboj/
21.5.1231. URL http://emboj.embopress.org/content/21/5/
1231.abstract.
[148] R. A. O’Donnell, F. Hackett, S. A. Howell, M. Treeck, N. Struck,
Z. Krnajski, C. Withers-Martinez, T. W. Gilberger, and M. J.
Blackman. Intramembrane proteolysis mediates shedding of
a key adhesin during erythrocyte invasion by the malaria par-
asite. The Journal of cell biology, 174(7):1023–33, Sept. 2006.
ISSN 0021-9525. doi: 10.1083/jcb.200604136. URL http://jcb.
rupress.org/content/174/7/1023.
[149] W. P. O’Meara, J. N. Mangeni, R. Steketee, and B. Greenwood.
Changes in the burden of malaria in sub-Saharan Africa. The
Lancet. Infectious diseases, 10(8):545–55, Aug. 2010. ISSN 1474-
4457. doi: 10.1016/S1473-3099(10)70096-7. URL http://www.
ncbi.nlm.nih.gov/pubmed/20637696.
[150] T. D. Otto, U. Böhme, A. P. Jackson, M. Hunt, B. Franke-Fayard,
W. A. M. Hoeijmakers, A. A. Religa, L. Robertson, M. Sanders,
S. A. Ogun, D. Cunningham, A. Erhart, O. Billker, S. M. Khan,
H. G. Stunnenberg, J. Langhorne, A. A. Holder, A. P. Waters,
C. I. Newbold, A. Pain, M. Berriman, and C. J. Janse. A com-
prehensive evaluation of rodent malaria parasite genomes and
gene expression. BMC biology, 12:86, Jan. 2014. ISSN 1741-7007.
doi: 10.1186/s12915-014-0086-0. URL http://europepmc.org/
articles/PMC4242472/?report=abstract.
[151] A. Pain, U. Böhme, A. E. Berry, K. Mungall, R. D. Finn, A. P.
Jackson, T. Mourier, J. Mistry, E. M. Pasini, M. A. Aslett,
S. Balasubrammaniam, K. Borgwardt, K. Brooks, C. Carret,
T. J. Carver, I. Cherevach, T. Chillingworth, T. G. Clark, M. R.
Galinski, N. Hall, D. Harper, D. Harris, H. Hauser, A. Ivens,
C. S. Janssen, T. Keane, N. Larke, S. Lapp, M. Marti, S. Moule,
I. M. Meyer, D. Ormond, N. Peters, M. Sanders, S. Sanders,
T. J. Sargeant, M. Simmonds, F. Smith, R. Squares, S. Thurston,
A. R. Tivey, D. Walker, B. White, E. Zuiderwijk, C. Churcher,
M. A. Quail, A. F. Cowman, C. M. R. Turner, M. A. Rajandream,
C. H. M. Kocken, A. W. Thomas, C. I. Newbold, B. G. Barrell,
and M. Berriman. The genome of the simian and human
malaria parasite Plasmodium knowlesi. Nature, 455(7214):
799–803, Oct. 2008. ISSN 1476-4687. doi: 10.1038/nature07306.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=2656934&tool=pmcentrez&rendertype=abstract.
[152] I. Pal-Bhowmick, J. Andersen, P. Srinivasan, D. L. Narum,
J. Bosch, and L. H. Miller. Binding of aldolase and
glyceraldehyde-3-phosphate dehydrogenase to the cyto-
plasmic tails of Plasmodium falciparum merozoite duffy
binding-like and reticulocyte homology ligands. mBio, 3(5),
Jan. 2012. ISSN 2150-7511. doi: 10.1128/mBio.00292-12. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3448169&tool=pmcentrez&rendertype=abstract.
[153] K. C. Pandey, N. Singh, S. Arastu-Kapur, M. Bogyo,
and P. J. Rosenthal. Falstatin, a cysteine protease in-
hibitor of Plasmodium falciparum, facilitates erythro-
cyte invasion. PLoS pathogens, 2(11):e117, Nov. 2006.
ISSN 1553-7374. doi: 10.1371/journal.ppat.0020117. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=1630708&tool=pmcentrez&rendertype=abstract.
[154] C. Pfander, B. Anar, F. Schwach, T. D. Otto, M. Brochet, K. Volk-
mann, M. A. Quail, A. Pain, B. Rosen, W. Skarnes, J. C. Rayner,
and O. Billker. A scalable pipeline for highly effective genetic
modification of a malaria parasite. Nature methods, 8(12):1078–82,
2011. doi: 10.1038/nmeth.1742.
[155] C. Pfander, B. Anar, F. Schwach, T. D. Otto, M. Brochet, K. Volk-
mann, M. A. Quail, A. Pain, B. Rosen, W. Skarnes, J. C. Rayner,
and O. Billker. A scalable pipeline for highly effective genetic
modification of a malaria parasite. Nature methods, 8(12):1078–82,
Dec. 2011. ISSN 1548-7105. doi: 10.1038/nmeth.1742. URL
http://dx.doi.org/10.1038/nmeth.1742.
[156] M. M. Pinheiro, M. A. Ahmed, S. B. Millar, T. Sanderson, T. D.
Otto, W. C. Lu, S. Krishna, J. C. Rayner, and J. Cox-Singh.
Plasmodium knowlesi genome sequences from clinical isolates
reveal extensive genomic dimorphism. PloS one, 10(4):e0121303,
Jan. 2015. ISSN 1932-6203. doi: 10.1371/journal.pone.0121303.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=4382175&tool=pmcentrez&rendertype=abstract.
[157] PlasmoDB. PlasmoDB: An integrative database of the
Plasmodium falciparum genome. Tools for accessing and
analyzing finished and unfinished sequence data. The
Plasmodium Genome Database Collaborative. Nucleic
acids research, 29(1):66–9, Jan. 2001. ISSN 1362-4962. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=29846&tool=pmcentrez&rendertype=abstract.
[158] R. N. Price, E. Tjitra, C. A. Guerra, S. Yeung, N. J. White,
and N. M. Anstey. Vivax malaria: neglected and not
benign. The American journal of tropical medicine and hy-
giene, 77(6 Suppl):79–87, Dec. 2007. ISSN 0002-9637. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2653940&tool=pmcentrez&rendertype=abstract.
[159] M. Prudêncio, A. Rodriguez, and M. M. Mota. The silent path
to thousands of merozoites: the Plasmodium liver stage. Na-
ture Reviews Microbiology, 4(11):849–856, Nov. 2006. ISSN 1740-
1526. doi: 10.1038/nrmicro1529. URL http://www.nature.com/
doifinder/10.1038/nrmicro1529.
[160] D. Quammen. Spillover: Animal Infections and the Next Human
Pandemic. 2012.
[161] N. B. Quashie, H. P. de Koning, and L. C. Ranford-Cartwright.
An improved and highly sensitive microfluorimetric method
for assessing susceptibility of Plasmodium falciparum to an-
timalarial drugs in vitro. Malaria journal, 5(1):95, Jan. 2006.
ISSN 1475-2875. doi: 10.1186/1475-2875-5-95. URL http://www.
malariajournal.com/content/5/1/95.
[162] J. C. Rayner, E. Vargas-Serrato, C. S. Huber, M. R. Galinski, and
J. W. Barnwell. A Plasmodium falciparum Homologue of Plas-
modium vivax Reticulocyte Binding Protein (PvRBP1) Defines
a Trypsin-resistant Erythrocyte Invasion Pathway. Journal of Ex-
perimental Medicine, 194(11), 2001.
[163] K. G. L. Roch, Y. Zhou, P. L. Blair, M. Grainger, J. K. Moch, J. D.
Haynes, P. D. Vega, A. A. Holder, S. Batalov, D. J. Carucci, and
E. A. Winzeler. Discovery of Gene Function by. 301(September):
1503–1508, 2003.
[164] C. V. Rooyen and G. R. Pile. Observations on infection by Plas-
modium knowlesi (ape malaria) in the treatment of general
paralysis of the insane. pages 662–666, 1935.
[165] M. Rougemont, M. V. Saanen, R. Sahli, H. P. Hinrikson, J. Bille,
and K. Jaton. Detection of Four Plasmodium Species in Blood
from Humans by 18S rRNA Gene Subunit-Based and Species-
Specific Real-Time PCR Assays. 42(12):5636–5643, 2004. doi:
10.1128/JCM.42.12.5636.
[166] K. J. Roux, D. I. Kim, M. Raida, and B. Burke. A promiscuous
biotin ligase fusion protein identifies proximal and interacting
proteins in mammalian cells. The Journal of cell biology, 196(6):
801–10, Mar. 2012. ISSN 1540-8140. doi: 10.1083/jcb.201112098.
URL http://jcb.rupress.org/content/196/6/801.full.
[167] N. Roy, S. Bhattacharyya, S. Chakrabarty, S. Laskar, S. M. Babu,
and M. K. Bhattacharyya. Dominant negative mutant of Plas-
modium Rad51 causes reduced parasite burden in host by abro-
gating DNA double-strand break repair. Molecular Microbiology,
94(2):353–366, 2014. ISSN 0950382X. doi: 10.1111/mmi.12762.
URL http://doi.wiley.com/10.1111/mmi.12762.
[168] RTS,S Clinical Trials Partnership. Efficacy and safety of RTS,S/AS01
malaria vaccine with or without a booster dose in infants and chil-
dren in Africa: final results of a phase 3, individually ran-
domised, controlled trial. The Lancet, 386(9988):31–45, Apr. 2015.
ISSN 01406736. doi: 10.1016/S0140-6736(15)60721-8. URL http:
//www.thelancet.com/article/S0140673615607218/fulltext.
[169] N. Rudin, E. Sugarman, and J. E. Haber. Genetic
and physical analysis of double-strand break repair
and recombination in Saccharomyces cerevisiae. Ge-
netics, 122(3):519–34, July 1989. ISSN 0016-6731. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=1203726&tool=pmcentrez&rendertype=abstract.
[170] A. Ruecker, M. Shea, F. Hackett, C. Suarez, E. M. A. Hirst, K. Mi-
lutinovic, C. Withers-Martinez, and M. J. Blackman. Proteolytic
activation of the essential parasitophorous vacuole cysteine pro-
tease SERA6 accompanies malaria parasite egress from its host
erythrocyte. The Journal of biological chemistry, 287(45):37949–63,
Nov. 2012. ISSN 1083-351X. doi: 10.1074/jbc.M112.400820. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3488066&tool=pmcentrez&rendertype=abstract.
[171] P. Rumore, B. Muralidhar, M. Lin, C. Lai, and C. R.
Steinman. Haemodialysis as a model for studying
endogenous plasma DNA: oligonucleosome-like struc-
ture and clearance. Clinical and experimental immunol-
ogy, 90(1):56–62, Oct. 1992. ISSN 0009-9104. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=1554541&tool=pmcentrez&rendertype=abstract.
[172] R. A. Seder, L.-J. Chang, M. E. Enama, K. L. Zephir, U. N. Sar-
war, I. J. Gordon, L. A. Holman, E. R. James, P. F. Billingsley,
A. Gunasekera, A. Richman, S. Chakravarty, A. Manoj, S. Vel-
murugan, M. Li, A. J. Ruben, T. Li, A. G. Eappen, R. E. Stafford,
S. H. Plummer, C. S. Hendel, L. Novik, P. J. M. Costner, F. H.
Mendoza, J. G. Saunders, M. C. Nason, J. H. Richardson, J. Mur-
phy, S. A. Davidson, T. L. Richie, M. Sedegah, A. Sutamihardja,
G. A. Fahle, K. E. Lyke, M. B. Laurens, M. Roederer, K. Tewari,
J. E. Epstein, B. K. L. Sim, J. E. Ledgerwood, B. S. Graham, and
S. L. Hoffman. Protection against malaria by intravenous im-
munization with a nonreplicating sporozoite vaccine. Science
(New York, N.Y.), 341(6152):1359–65, Sept. 2013. ISSN 1095-9203.
doi: 10.1126/science.1241800. URL http://www.ncbi.nlm.nih.
gov/pubmed/23929949.
[173] B. Shen and L. D. Sibley. Toxoplasma aldolase is required for
metabolism but dispensable for host-cell invasion. Proceedings
of the National Academy of Sciences, 111(9):3567–3572, Mar.
2014. ISSN 0027-8424. doi: 10.1073/pnas.1315156111. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3948255&tool=pmcentrez&rendertype=abstract.
[174] P. S. Sijwali, K. Kato, K. B. Seydel, J. Gut, J. Lehman, M. Klemba,
D. E. Goldberg, L. H. Miller, and P. J. Rosenthal. Plasmodium
falciparum cysteine protease falcipain-1 is not essential in
erythrocytic stage malaria parasites. Proceedings of the National
Academy of Sciences of the United States of America, 101(23):8721–6,
June 2004. ISSN 0027-8424. doi: 10.1073/pnas.0402738101. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=423262&tool=pmcentrez&rendertype=abstract.
[175] J. C. Silva, A. Egan, R. Friedman, J. B. Munro, J. M. Carlton,
and A. L. Hughes. Genome sequences reveal divergence times
of malaria parasite lineages. Parasitology, 138(13):1737–49, Nov.
2011. ISSN 1469-8161. doi: 10.1017/S0031182010001575. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3081533&tool=pmcentrez&rendertype=abstract.
[176] B. Singh, L. Kim Sung, A. Matusop, A. Radhakrishnan, S. S. G.
Shamsul, J. Cox-Singh, A. Thomas, and D. J. Conway. A large
focus of naturally acquired Plasmodium knowlesi infections in
human beings. Lancet, 363(9414):1017–24, Mar. 2004. ISSN 1474-
547X. doi: 10.1016/S0140-6736(04)15836-4. URL http://www.
ncbi.nlm.nih.gov/pubmed/15051281.
[177] S. Singh, M. M. Alam, I. Pal-Bhowmick, J. A. Brzostowski,
and C. E. Chitnis. Distinct external signals trigger sequen-
tial release of apical organelles during erythrocyte invasion
by malaria parasites. PLoS pathogens, 6(2):e1000746, Feb.
2010. ISSN 1553-7374. doi: 10.1371/journal.ppat.1000746.
URL http://journals.plos.org/plospathogens/article?id=
10.1371/journal.ppat.1000746.
[178] A. Sinha, K. R. Hughes, K. K. Modrzynska, T. D. Otto, C. Pfan-
der, N. J. Dickens, A. A. Religa, E. Bushell, A. L. Graham,
R. Cameron, B. F. C. Kafsack, A. E. Williams, M. Llinás, M. Ber-
riman, O. Billker, and A. P. Waters. A cascade of DNA-binding
proteins for sexual commitment and development in Plasmod-
ium. Nature, 507(7491):253–7, Mar. 2014. ISSN 1476-4687.
doi: 10.1038/nature12970. URL http://dx.doi.org/10.1038/
nature12970.
[179] S. Sinha, B. Medhi, and R. Sehgal. Challenges of drug-
resistant malaria. Parasite (Paris, France), 21:61, Jan. 2014.
ISSN 1776-1042. doi: 10.1051/parasite/2014059. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=4234044&tool=pmcentrez&rendertype=abstract.
[180] M. E. Sinka, M. J. Bangs, S. Manguin, T. Chareonviriyaphap,
A. P. Patil, W. H. Temperley, P. W. Gething, I. R. F. Elyazar, C. W.
Kabaria, R. E. Harbach, and S. I. Hay. The dominant Anopheles
vectors of human malaria in the Asia-Pacific region: occurrence
data, distribution maps and bionomic précis. Parasites & vectors,
4(1):89, Jan. 2011. ISSN 1756-3305. doi: 10.1186/1756-3305-4-89.
URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=3127851&tool=pmcentrez&rendertype=abstract.
[181] L. Solyakov, J. Halbert, M. M. Alam, J.-P. Semblat, D. Dorin-
Semblat, L. Reininger, A. R. Bottrill, S. Mistry, A. Abdi, C. Fen-
nell, Z. Holland, C. Demarta, Y. Bouza, A. Sicard, M.-P. Nivez,
S. Eschenlauer, T. Lama, D. C. Thomas, P. Sharma, S. Agar-
wal, S. Kern, G. Pradel, M. Graciotti, A. B. Tobin, and C. Do-
erig. Global kinomic and phospho-proteomic analyses of the
human malaria parasite Plasmodium falciparum. Nature com-
munications, 2:565, Jan. 2011. ISSN 2041-1723. doi: 10.1038/
ncomms1558. URL http://dx.doi.org/10.1038/ncomms1558.
[182] M. D. Spring, J. T. Lin, J. E. Manning, P. Vanachayangkul,
S. Somethy, R. Bun, Y. Se, S. Chann, M. Ittiverakul, P. Sia-
ngam, W. Kuntawunginn, M. Arsanok, N. Buathong, S. Chao-
rattanakawee, P. Gosi, W. Ta-aksorn, N. Chanarat, S. Sun-
drakes, N. Kong, T. K. Heng, S. Nou, P. Teja-isavadharm,
S. Pichyangkul, S. T. Phann, S. Balasubramanian, J. J. Juliano,
S. R. Meshnick, C. M. Chour, S. Prom, C. A. Lanteri, C. Lon,
and D. L. Saunders. Dihydroartemisinin-piperaquine failure as-
sociated with a triple mutant including kelch13 C580Y in Cam-
bodia: an observational cohort study. The Lancet. Infectious dis-
eases, 15(6):683–91, June 2015. ISSN 1474-4457. doi: 10.1016/
S1473-3099(15)70049-6. URL http://www.ncbi.nlm.nih.gov/
pubmed/25877962.
[183] A. Srivastava, S. Singh, S. Dhawan, M. Mahmood Alam,
A. Mohmmed, and C. E. Chitnis. Localization of apical sushi
protein in Plasmodium falciparum merozoites. Molecular and
biochemical parasitology, 174(1):66–9, Nov. 2010. ISSN 1872-9428.
doi: 10.1016/j.molbiopara.2010.06.003. URL http://www.ncbi.
nlm.nih.gov/pubmed/20540969.
[184] J. Straimer, M. C. S. Lee, A. H. Lee, B. Zeitler, A. E. Williams,
J. R. Pearl, L. Zhang, E. J. Rebar, P. D. Gregory, M. Llinás, F. D.
Urnov, and D. A. Fidock. Site-Specific Editing of the Plasmod-
ium falciparum Genome Using Engineered Zinc-Finger Nucle-
ases. Nature methods, 9(10):993–998, 2013. doi: 10.1038/nmeth.
2143.
[185] A. Sturm, R. Amino, C. van de Sand, T. Regen, S. Retzlaff,
A. Rennenberg, A. Krueger, J.-M. Pollok, R. Menard, and V. T.
Heussler. Manipulation of host hepatocytes by the malaria
parasite for delivery into liver sinusoids. Science (New York,
N.Y.), 313(5791):1287–90, Sept. 2006. ISSN 1095-9203. doi:
10.1126/science.1129720. URL http://www.ncbi.nlm.nih.gov/
pubmed/16888102.
[186] C. J. Sutherland, N. Tanomsing, D. Nolder, M. Oguike, C. Jenni-
son, S. Pukrittayakamee, C. Dolecek, T. T. Hien, V. E. do Rosário,
A. P. Arez, J. a. Pinto, P. Michon, A. A. Escalante, F. Nosten,
M. Burke, R. Lee, M. Blaze, T. D. Otto, J. W. Barnwell, A. Pain,
J. Williams, N. J. White, N. P. J. Day, G. Snounou, P. J. Lockhart,
P. L. Chiodini, M. Imwong, and S. D. Polley. Two nonrecombin-
ing sympatric forms of the human malaria parasite Plasmod-
ium ovale occur globally. The Journal of infectious diseases, 201
(10):1544–50, May 2010. ISSN 1537-6613. doi: 10.1086/652240.
URL http://www.ncbi.nlm.nih.gov/pubmed/20380562.
[187] T. Taechalertpaisarn, C. Crosnier, S. J. Bartholdson, A. N. Hod-
der, J. Thompson, L. Y. Bustamante, D. W. Wilson, P. R. Sanders,
G. J. Wright, J. C. Rayner, A. F. Cowman, P. R. Gilson, and B. S.
Crabb. Biochemical and functional analysis of two Plasmodium
falciparum blood-stage 6-cys proteins: P12 and P41. PloS one, 7
(7):e41937, Jan. 2012. ISSN 1932-6203. doi: 10.1371/journal.pone.
0041937. URL http://journals.plos.org/plosone/article?
id=10.1371/journal.pone.0041937.
[188] C. L. Tay. The targets and role of palmitoylation in Plasmodium
parasites. PhD thesis, 2013.
[189] R. Tewari, U. Straschil, A. Bateman, U. Böhme, I. Cherevach,
P. Gong, A. Pain, and O. Billker. The systematic functional
analysis of Plasmodium protein kinases identifies essential reg-
ulators of mosquito transmission. Cell host & microbe, 8(4):377–87,
Oct. 2010. ISSN 1934-6069. doi: 10.1016/j.chom.2010.09.
006. URL http://www.pubmedcentral.nih.gov/articlerender.
fcgi?artid=2977076&tool=pmcentrez&rendertype=abstract.
[190] M. Theron, R. L. Hesketh, S. Subramanian, and J. C. Rayner.
An adaptable two-color flow cytometric assay to quantitate
the invasion of erythrocytes by Plasmodium falciparum
parasites. Cytometry. Part A : the journal of the Interna-
tional Society for Analytical Cytology, 77(11):1067–74, Nov.
2010. ISSN 1552-4930. doi: 10.1002/cyto.a.20972. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3047707&tool=pmcentrez&rendertype=abstract.
[191] J. Thompson, R. E. Cooke, S. Moore, L. F. Anderson, C. J.
Janse, and A. P. Waters. PTRAMP; a conserved Plasmod-
ium thrombospondin-related apical merozoite protein. Molec-
ular and biochemical parasitology, 134(2):225–32, Apr. 2004. ISSN
0166-6851. doi: 10.1016/j.molbiopara.2003.12.003. URL http:
//www.ncbi.nlm.nih.gov/pubmed/15003842.
[192] J. Thurston. The action of antimalarial drugs in mice infected
with Plasmodium berghei. British journal of pharmacology and
chemotherapy, 5(3):409–16, Sept. 1950. ISSN 0366-0826. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=1509934&tool=pmcentrez&rendertype=abstract.
[193] C. J. Tonkin, G. G. van Dooren, T. P. Spurck, N. S. Struck,
R. T. Good, E. Handman, A. F. Cowman, and G. I. McFad-
den. Localization of organellar proteins in Plasmodium fal-
ciparum using a novel set of transfection vectors and a new
immunofluorescence fixation method. Molecular and biochemi-
cal parasitology, 137(1):13–21, Sept. 2004. ISSN 0166-6851. doi:
10.1016/j.molbiopara.2004.05.009. URL http://www.ncbi.nlm.
nih.gov/pubmed/15279947.
[194] W. Trager and J. B. Jensen. Human malaria parasites in continuous
culture. Science (New York, N.Y.), 193(4254):673-5, Aug. 1976.
ISSN 0036-8075. URL http://www.ncbi.nlm.nih.gov/pubmed/
781840.
[195] A. Z. Tremp, F. S. Al-Khattaf, and J. T. Dessens. Distinct
temporal recruitment of Plasmodium alveolins to the subpellicular
network. Parasitology research, 113(11):4177-88, Nov.
2014. ISSN 1432-1955. doi: 10.1007/s00436-014-4093-4. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=4200347&tool=pmcentrez&rendertype=abstract.
[196] M. Tufet-Bayona, C. J. Janse, S. M. Khan, A. P. Waters, R. E.
Sinden, and B. Franke-Fayard. Localisation and timing of ex-
pression of putative Plasmodium berghei rhoptry proteins in
merozoites and sporozoites. Molecular and biochemical parasitology,
166(1):22-31, July 2009. ISSN 1872-9428. doi: 10.1016/j.
molbiopara.2009.02.009. URL http://www.ncbi.nlm.nih.gov/
pubmed/19428669.
[197] P. Uzureau, J.-C. Barale, C. J. Janse, A. P. Waters, and C. B.
Breton. Gene targeting demonstrates that the Plasmodium
berghei subtilisin PbSUB2 is essential for red cell invasion
and reveals spontaneous genetic recombination events. Cellular
microbiology, 6(1):65-78, Jan. 2004. ISSN 1462-5814. URL
http://www.ncbi.nlm.nih.gov/pubmed/14678331.
[198] M. R. van Dijk, B. C. L. van Schaijk, S. M. Khan, M. W.
van Dooren, J. Ramesar, S. Kaczanowski, G.-J. van Gemert,
H. Kroeze, H. G. Stunnenberg, W. M. Eling, R. W. Sauer-
wein, A. P. Waters, and C. J. Janse. Three members of
the 6-cys protein family of Plasmodium play a role in
gamete fertility. PLoS pathogens, 6(4):e1000853, Apr. 2010.
ISSN 1553-7374. doi: 10.1371/journal.ppat.1000853. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=2851734&tool=pmcentrez&rendertype=abstract.
[199] A. van der Wel, C. H. Kocken, T. C. Pronk, B. Franke-Fayard, and
A. W. Thomas. New selectable markers and single crossover
integration for the highly versatile Plasmodium knowlesi transfection
system. Molecular and Biochemical Parasitology, 134(1):97-104,
Mar. 2004. ISSN 0166-6851. doi: 10.1016/j.molbiopara.
2003.10.019. URL http://www.sciencedirect.com/science/
article/pii/S0166685103003311.
[200] S. Vinayak, M. T. Alam, T. Mixson-Hayden, A. M. McCollum,
R. Sem, N. K. Shah, P. Lim, S. Muth, W. O. Rogers, T. Fandeur,
J. W. Barnwell, A. A. Escalante, C. Wongsrichanalai, F. Ariey,
S. R. Meshnick, and V. Udhayakumar. Origin and evolu-
tion of sulfadoxine resistant Plasmodium falciparum. PLoS
pathogens, 6(3):e1000830, Mar. 2010. ISSN 1553-7374. doi:
10.1371/journal.ppat.1000830. URL http://dx.plos.org/10.
1371/journal.ppat.1000830.
[201] J. C. Wagner, R. J. Platt, S. J. Goldfless, F. Zhang, and J. C. Niles.
Efficient CRISPR-Cas9-mediated genome editing in Plasmodium
falciparum. Nature Methods, 11(August):1-6, 2014. ISSN
1548-7091. doi: 10.1038/nmeth.3063. URL http://www.nature.
com/doifinder/10.1038/nmeth.3063.
[202] Y. Wang, L. Y. Geer, C. Chappey, J. A. Kans, and S. H. Bryant.
Cn3D: sequence and structure views for Entrez. Trends in biochemical
sciences, 25(6):300-2, June 2000. ISSN 0968-0004. URL
http://www.ncbi.nlm.nih.gov/pubmed/10838572.
[203] Wel. New selectable markers and single crossover integration
for the highly versatile Plasmodium knowlesi transfection system.
URL http://www.ncbi.nlm.nih.gov/
pubmed/14747147.
[204] S. P. Wertheimer and A. W. Barnwell. Plasmodium vivax interaction
with the Human Duffy Blood Group Glycoprotein: Identification
of a Parasite Receptor-like Protein. 350:340-350, 1989.
[205] T. William, H. A. Rahman, J. Jelip, M. Y. Ibrahim, J. Menon, M. J.
Grigg, T. W. Yeo, N. M. Anstey, and B. E. Barber. Increasing
Incidence of Plasmodium knowlesi Malaria following Control
of P. falciparum and P. vivax Malaria in Sabah, Malaysia. PLoS
neglected tropical diseases, 7(1):e2026, Jan. 2013. ISSN 1935-2735.
doi: 10.1371/journal.pntd.0002026. URL http://www.ncbi.nlm.
nih.gov/pubmed/23359830.
[206] C. C. Wirth and G. Pradel. Molecular mechanisms of host
cell egress by malaria parasites. International journal of medical
microbiology: IJMM, 302(4-5):172-8, Oct. 2012. ISSN 1618-
0607. doi: 10.1016/j.ijmm.2012.07.003. URL http://www.ncbi.
nlm.nih.gov/pubmed/22951233.
[207] G. J. Wright and J. C. Rayner. Plasmodium falciparum
erythrocyte invasion: combining function with immune
evasion. PLoS pathogens, 10(3):e1003943, Mar. 2014. ISSN
1553-7374. doi: 10.1371/journal.ppat.1003943. URL
http://www.pubmedcentral.nih.gov/articlerender.fcgi?
artid=3961354&tool=pmcentrez&rendertype=abstract.
[208] Y. Wu, C. D. Sifri, H. H. Lei, X. Z. Su, and T. E. Wellems. Trans-
fection of Plasmodium falciparum within human red blood
cells. Proceedings of the National Academy of Sciences, 92(4):973-977,
Feb. 1995. ISSN 0027-8424. doi: 10.1073/pnas.92.4.973. URL
http://www.pnas.org/content/92/4/973.abstract.
[209] S. Yeoh, R. A. O’Donnell, K. Koussis, A. R. Dluzewski, K. H.
Ansell, S. A. Osborne, F. Hackett, C. Withers-Martinez, G. H.
Mitchell, L. H. Bannister, J. S. Bryans, C. A. Kettleborough,
and M. J. Blackman. Subcellular discharge of a serine pro-
tease mediates release of invasive malaria parasites from host
erythrocytes. Cell, 131(6):1072-83, Dec. 2007. ISSN 0092-8674.
doi: 10.1016/j.cell.2007.10.049. URL http://www.ncbi.nlm.nih.
gov/pubmed/18083098.
[210] M. Yuda, S. Iwanaga, S. Shigenobu, G. R. Mair, C. J. Janse,
A. P. Waters, T. Kato, and I. Kaneko. Identification of a transcription
factor in the mosquito-invasive stage of malaria parasites.
Molecular microbiology, 71(6):1402-14, Mar. 2009. ISSN
1365-2958. doi: 10.1111/j.1365-2958.2009.06609.x. URL http:
//www.ncbi.nlm.nih.gov/pubmed/19220746.
[211] N. A. Yusuf, J. L. Green, R. J. Wall, E. Knuepfer, R. W. Moon,
C. Schulte-Huxel, R. R. Stanway, S. R. Martin, S. A. Howell,
C. H. Douse, E. Cota, E. W. Tate, R. Tewari, and A. A. Holder.
The Plasmodium Class XIV Myosin, MyoB, Has a Distinct Sub-
cellular Location in Invasive and Motile Stages of the Malaria
Parasite and an Unusual Light Chain. The Journal of biological
chemistry, 290(19):12147-64, May 2015. ISSN 1083-351X. doi:
10.1074/jbc.M115.637694. URL http://www.jbc.org/content/
290/19/12147.long.
[212] C. Zhang, B. Xiao, Y. Jiang, Y. Zhao, Z. Li, H. Gao, Y. Ling,
J. Wei, S. Li, M. Lu, X.-Z. Su, H. Cui, and J. Yuan. Efficient
Editing of Malaria Parasite Genome Using the CRISPR/Cas9
System. mBio, 5(4):1-9, 2014. ISSN 2150-7511. doi:
10.1128/mBio.01414-14. URL http://www.ncbi.nlm.nih.gov/
pubmed/24987097.
[213] Y. Zhang, F. Buchholz, J. P. Muyrers, and A. F. Stewart. A
new logic for DNA engineering using recombination in Escherichia
coli. Nature genetics, 20(2):123-8, Oct. 1998. ISSN 1061-
4036. doi: 10.1038/2417. URL http://www.ncbi.nlm.nih.gov/
pubmed/9771703.
[214] E. S. Zuccala, A. M. Gout, C. Dekiwadia, D. S. Marapana, F. An-
grisano, L. Turnbull, D. T. Riglar, K. L. Rogers, C. B. Whitchurch,
S. A. Ralph, T. P. Speed, and J. Baum. Subcompartmental-
isation of proteins in the rhoptries correlates with ordered
events of erythrocyte invasion by the blood stage malaria par-
asite. PloS one, 7(9):e46160, Jan. 2012. ISSN 1932-6203. doi: 10.
1371/journal.pone.0046160. URL http://journals.plos.org/
plosone/article?id=10.1371/journal.pone.0046160#s4.
Part IV
APPENDIX
A
TABLES
A.1 PlasmoGEM vectors used in this work
gene vector gene vector
PBANKA_010260 PbGEM-327571 PBANKA_100170 PbGEM-292664
PBANKA_010290 PbGEM-005512 PBANKA_100240 PbGEM-034599
PBANKA_010630 PbGEM-327723 PBANKA_100250 PbGEM-292784
PBANKA_011160 PbGEM-267395 PBANKA_100360 PbGEM-264676
PBANKA_020460 PbGEM-082095 PBANKA_100530 PbGEM-333507
PBANKA_021450 PbGEM-269163 PBANKA_100550 PbGEM-243822
PBANKA_030110 PbGEM-269371 PBANKA_100920 PbGEM-293808
PBANKA_030450 PbGEM-328427 PBANKA_101040 PbGEM-333691
PBANKA_030670 PbGEM-009687 PBANKA_101910 PbGEM-263388
PBANKA_030950 PbGEM-270402 PBANKA_102010 PbGEM-096670
PBANKA_031140 PbGEM-111762 PBANKA_102760 PbGEM-296182
PBANKA_031480 PbGEM-010745 PBANKA_103170 PbGEM-334339
PBANKA_031660 PbGEM-328723 PBANKA_103210 PbGEM-334355
PBANKA_040180 PbGEM-230686 PBANKA_103540 PbGEM-039414
PBANKA_040640 PbGEM-328923 PBANKA_103970 PbGEM-297582
PBANKA_040690 PbGEM-271986 PBANKA_110140 PbGEM-334659
PBANKA_041600 PbGEM-085062 PBANKA_110330 PbGEM-298173
PBANKA_041780 PbGEM-273282 PBANKA_110650 PbGEM-098459
PBANKA_050180 PbGEM-273642 PBANKA_110690 PbGEM-098495
PBANKA_050410 PbGEM-273986 PBANKA_110760 PbGEM-265060
PBANKA_050690 PbGEM-230990 PBANKA_111530 PbGEM-121282
PBANKA_050720 PbGEM-274330 PBANKA_112010 PbGEM-335251
PBANKA_051140 PbGEM-231569 PBANKA_112390 PbGEM-247136
PBANKA_051390 PbGEM-329643 PBANKA_112410 PbGEM-300723
PBANKA_051520 PbGEM-086151 PBANKA_112810 PbGEM-099883
PBANKA_051535 PbGEM-232086 PBANKA_113780 PbGEM-100665
PBANKA_052170 PbGEM-329843 PBANKA_113880 PbGEM-335867
PBANKA_061970 PbGEM-330355 PBANKA_120180 PbGEM-303714
PBANKA_062240 PbGEM-088353 PBANKA_120200 PbGEM-046708
PBANKA_070200 PbGEM-330539 PBANKA_120800 PbGEM-101938
PBANKA_071270 PbGEM-330859 PBANKA_121060 PbGEM-304938
PBANKA_071310 PbGEM-330875 PBANKA_121180 PbGEM-336371
PBANKA_071660 PbGEM-282090 PBANKA_121440 PbGEM-336467
PBANKA_072070 PbGEM-331075 PBANKA_122440 PbGEM-120990
PBANKA_072150 PbGEM-237247 PBANKA_122540 PbGEM-050063
PBANKA_080130 PbGEM-331195 PBANKA_122670 PbGEM-265652
PBANKA_080220 PbGEM-023655 PBANKA_122740 PbGEM-050376
PBANKA_080250 PbGEM-283042 PBANKA_122760 PbGEM-252519
PBANKA_080450 PbGEM-283290 PBANKA_123730 PbGEM-103704
PBANKA_080960 PbGEM-237098 PBANKA_124060 PbGEM-103910
PBANKA_081510 PbGEM-284458 PBANKA_130340 PbGEM-310399
PBANKA_081700 PbGEM-238732 PBANKA_130430 PbGEM-337499
PBANKA_082160 PbGEM-285114 PBANKA_130520 PbGEM-053796
PBANKA_082490 PbGEM-026600 PBANKA_131280 PbGEM-337723
PBANKA_082870 PbGEM-027131 PBANKA_131420 PbGEM-311741
PBANKA_083100 PbGEM-259972 PBANKA_131540 PbGEM-265356
PBANKA_083290 PbGEM-286410 PBANKA_133270 PbGEM-250003
PBANKA_083300 PbGEM-286426 PBANKA_133460 PbGEM-265364
PBANKA_083560 PbGEM-028140 PBANKA_133580 PbGEM-058012
PBANKA_090260 PbGEM-332267 PBANKA_133660 PbGEM-058133
PBANKA_090380 PbGEM-111794 PBANKA_133890 PbGEM-107133
PBANKA_090590 PbGEM-260788 PBANKA_134490 PbGEM-338675
PBANKA_090610 PbGEM-332331 PBANKA_134640 PbGEM-316019
PBANKA_090840 PbGEM-288058 PBANKA_134700 PbGEM-338747
PBANKA_090860 PbGEM-288074 PBANKA_135570 PbGEM-317163
PBANKA_091030 PbGEM-029684 PBANKA_135610 PbGEM-317179
PBANKA_091150 PbGEM-332491 PBANKA_135740 PbGEM-317307
PBANKA_091170 PbGEM-029891 PBANKA_136440 PbGEM-109041
PBANKA_091240 PbGEM-332523 PBANKA_140920 PbGEM-319715
PBANKA_091510 PbGEM-332595 PBANKA_141200 PbGEM-339691
PBANKA_091940 PbGEM-332699 PBANKA_141380 PbGEM-320219
PBANKA_092090 PbGEM-240365 PBANKA_141980 PbGEM-256665
PBANKA_092170 PbGEM-332739 PBANKA_142310 PbGEM-065482
PBANKA_092250 PbGEM-289713 PBANKA_143360 PbGEM-322619
PBANKA_092510 PbGEM-289985 PBANKA_143910 PbGEM-323355
PBANKA_092560 PbGEM-290017 PBANKA_144290 PbGEM-340475
PBANKA_092940 PbGEM-290680 PBANKA_144720 PbGEM-340635
PBANKA_092970 PbGEM-238659 PBANKA_145700 PbGEM-264860
PBANKA_093190 PbGEM-290936 PBANKA_145870 PbGEM-325915
PBANKA_093200 PbGEM-333019 PBANKA_145910 PbGEM-258884
PBANKA_093210 PbGEM-239370 PBANKA_145950 PbGEM-325987
PBANKA_093690 PbGEM-240525 PBANKA_146060 PbGEM-071005
PBANKA_100010 PbGEM-333323
Table S2: DNA oligos used in these studies. Note that other oligos for Plas-
moGEM vectors are not listed since these are displayed on the
PlasmoGEM website.
Oligo Sequence
arg91 TCGGCATTCCTGCTGAACCGCTCTTCCGATCTGTAATTCGTGCGCGTCAG
arg97 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTTCAATTTCGATGGGTAC
1306400_del CGTGCAGCAAAGGAATCAGT
1306400_del AATCGACAAGGCTCCCATGA
0941100_del GGAGGGGTAAACAAGTGGGA
0941100_del CGTATGACTCGGACGGACAT
1207600_del CCCGCTATGCTCTCCAAAAC
1207600_del GGACGAAGGGGAGTGGATAG
0918700_del GGAGCGATGAGAAGAGAGCT
0918700_del TCGAATGCCCACCTTCTCTT
pkg_del ctatcgcccacctgcatttc
pkg_del tccgttcactggtaccgtac
OligoDNAJfw attgCCAGGCACGGGCACTTCCAG
OligoDNAJrev aaacCACCATATAGCGAACAAGTGGGAC
TGL58int TCGCCCACGTAAACTGCGCA
TGL59int CCCACCCACCCTCTTGCAGG
TGL60int AATTCGGTGTAGGGCGGCGC
TGL61int CGGTGAGGGCGCAGAAGTGG
TGL62int TGCCCACCTGAAGAACAGTGAGC
TGL63int CCAGGCACGGGCACTTCCAG
TGL64int GCCTACTGAGCGACCGCGTC
TGL65int TGCGCGTGGTCTTCTCACGG
TGL66int TCTGTGCAGGCGAGCATGGC
pkhsp70L tggagtgtagccatccgaat
pkhsp70R atatgtgtactccccagcgc
dhfrL ttcgctaaactgcatcgtcg
dhfrR tgggtgattcatggcttcct
TPR458 TTTTAAGTGTAGTTAATTCATCAAATAGCATGCCTGCAGGTCGACTCTAGAGGATCCGAAAATGGGTAAAAAAAAAAAAAAAAGAAAAGAGAACATG
TPR459 agttgataacggactagccttattttaacttgctaTTTCtagctctaaaacagGTCTTCtcGAAGACccCAATAATATACTGTAACTCAGAATA
B
SUPPLEMENTARY FIGURES
Figure S1: Expression levels of the P. falciparum homolog of AP2-O by stage [163]
Figure S2: Map of the CRISPR/Cas9 vector (12,332 bp) expressing the guide
RNA from the U6 promoter, the yDHODH resistance marker and the Cas9
protein. BbsI sites are digested for insertion of the guide.
Figure S3: Cohen-Friendly association plot (panels A and B) showing the
relationship between the phenotypes at either end of a PlasmoINT
connection. Phenotype categories on both axes: Essential,
Redundant, Slow. The width of each cell represents the expected
size of the cell and the height the signed difference between
observed and expected. The data for this figure are combined from
the barseq experiment and PhenoPlasm.
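
As an illustration of how the cells of an association plot such as Figure S3 are sized, the sketch below computes the expected count, box height and box width for each cell of a contingency table. It assumes the standard Cohen-Friendly construction, in which the height is proportional to the Pearson residual and the width to the square root of the expected count, so that the box area is proportional to observed minus expected. Whether Figure S3 was drawn with exactly this scaling is an assumption, the function name is hypothetical, and the counts are invented for illustration; they are not data from this work.

```python
from math import sqrt


def association_cells(table):
    """Return expected count, box height and box width for each table cell.

    Assumes the standard Cohen-Friendly scaling: height is the Pearson
    residual (observed - expected) / sqrt(expected), width is sqrt(expected).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    cells = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            out_row.append({
                "expected": expected,
                "height": (observed - expected) / sqrt(expected),
                "width": sqrt(expected),
            })
        cells.append(out_row)
    return cells


# Hypothetical counts of connections, indexed by the phenotype at each end.
labels = ["Essential", "Redundant", "Slow"]
counts = [[12, 5, 3],
          [6, 20, 4],
          [2, 5, 8]]

cells = association_cells(counts)
for label, row in zip(labels, cells):
    print(label, [round(cell["height"], 2) for cell in row])
```

Boxes with positive height mark phenotype pairs that occur more often than expected under independence, and boxes with negative height mark pairs that occur less often.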
C
RAW BARSEQ DATA
The following pages display, for the sake of completeness, the raw
barcode-ratios for barseq experiments. In due course these data will
also be available at http://plasmogem.sanger.ac.uk in an interactive
format.
[Figure: raw barcode ratios, one panel per gene: PBANKA_031480, PBANKA_040820, PBANKA_060550, PBANKA_082870, PBANKA_112430, PBANKA_124430, PBANKA_131480, PBANKA_134170, PBANKA_136210, PBANKA_140160, PBANKA_142540, PBANKA_142880. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_051820, PBANKA_052140, PBANKA_052240, PBANKA_093240, PBANKA_101290, PBANKA_101330, PBANKA_103510, PBANKA_103780, PBANKA_110520, PBANKA_114490, PBANKA_123730, PBANKA_141580. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_020460, PBANKA_051500, PBANKA_060530, PBANKA_092520, PBANKA_110420, PBANKA_111190, PBANKA_111470, PBANKA_113780, PBANKA_113870, PBANKA_133760, PBANKA_144040, PBANKA_146060. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_030670, PBANKA_031420, PBANKA_051110, PBANKA_051160, PBANKA_051520, PBANKA_052170, PBANKA_061910, PBANKA_080340, PBANKA_093980, PBANKA_101520, PBANKA_121780, PBANKA_130920. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_000440, PBANKA_040180, PBANKA_071750, PBANKA_080090, PBANKA_082520, PBANKA_110690, PBANKA_112010, PBANKA_113880, PBANKA_124460, PBANKA_130380, PBANKA_135570, PBANKA_144610. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_040710, PBANKA_050720, PBANKA_061360, PBANKA_080450, PBANKA_090610, PBANKA_090930, PBANKA_091890, PBANKA_100250, PBANKA_130640, PBANKA_131420, PBANKA_141910, PBANKA_144720. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_011160, PBANKA_050180, PBANKA_050690, PBANKA_081510, PBANKA_092160, PBANKA_092940, PBANKA_121060, PBANKA_121330, PBANKA_133040, PBANKA_134000, PBANKA_143270, PBANKA_146350. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_061510, PBANKA_071660, PBANKA_081240, PBANKA_083300, PBANKA_083380, PBANKA_093190, PBANKA_093200, PBANKA_120350, PBANKA_131280, PBANKA_135740, PBANKA_144160, PBANKA_145950. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_061580, PBANKA_081760, PBANKA_082160, PBANKA_090260, PBANKA_092510, PBANKA_093690, PBANKA_103170, PBANKA_111980, PBANKA_122980, PBANKA_132890, PBANKA_135610, PBANKA_140440. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_021450, PBANKA_040280, PBANKA_040640, PBANKA_040840, PBANKA_091150, PBANKA_091510, PBANKA_092090, PBANKA_092900, PBANKA_112140, PBANKA_113980, PBANKA_121970, PBANKA_142000. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_030110, PBANKA_050410, PBANKA_051360, PBANKA_051535, PBANKA_062250, PBANKA_091840, PBANKA_111160, PBANKA_111620, PBANKA_121440, PBANKA_142090, PBANKA_142790, PBANKA_144100. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene: PBANKA_041700, PBANKA_070200, PBANKA_082410, PBANKA_092560, PBANKA_103010, PBANKA_103440, PBANKA_112870, PBANKA_130430, PBANKA_131540, PBANKA_142720, PBANKA_144110, PBANKA_145870. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]
[Figure: raw barcode ratios, one panel per gene or control: p230p-tag, PBANKA_061040, PBANKA_081430, PBANKA_092250, PBANKA_100920, PBANKA_112390, PBANKA_121180, PBANKA_122760, PBANKA_142220. x-axis: Day (4 to 8); y-axis: Barcode ratio; legend: Mouse 1, 2, 3.]