Saturday, December 12, 2009

The complete transcript of Dr. Paul Keim's testimony before the NAS, Sept 24 2009

Dr. Paul Keim and his research group developed the assays used to distinguish the Ames strain of Bacillus anthracis from other closely related strains. These assays played a critical role in defining the crime scene in the 2001 anthrax attacks, and a survey of the research papers produced by this group indicates a very thorough approach to the subject, as you'll see if you read the transcript.

However, he was not directly involved in the later work regarding 'morphs' - subvariants of the Ames strain.

This testimony is highly technical, but I'll try to parse it out later.

The original audio that this was transcribed from is available at the NAS website:
http://www.nationalacademies.org/newsroom/nalerts/20090925.html

Slides were shown with this presentation, but unfortunately the NAS did not see fit to post them online.

Transcript begins:

Dr. Paul Keim from Northern Arizona University

So, unlike the other speakers, I of course delivered my speech by slides five minutes ago, so we'll see if they actually come up alright.

Thanks very much for inviting me, I will try to encapsulate eight years of hard work into thirty minutes, so if it feels like you're flying at 40,000 feet, it's close - because the details, we've tried to publish, and most of the details of the analysis - not all, but most have actually appeared in the peer-reviewed literature already, and I have a long list of reading material for you to do at home. My understanding is you already have quite a bit of material to read.

So what I intend to do today is to go over what the Ames strain is, what we know about it, how we developed methods for detecting it and understanding it, a little bit about the history of that work, our involvement in the investigation, talk about the validation of assays and how those assays were used in the investigation of this crime.

So we were involved from right at the very beginning - this is a sample that came to my laboratory on a Thursday evening - it was a culture that came out of the cerebrospinal fluid of Mr. Stevens, the first victim. This culture came into the laboratory at about eight o'clock at night, and we spent all night analyzing it with the best methods we had at the time, which we had been developing over several years under funding from the Department of Energy. The next morning we were able to actually call the CDC in Atlanta and the FBI and tell them that this in fact was something that matched the Ames strain.

This was based on a DNA fingerprinting method that involved eight different loci in the genome of Bacillus anthracis that were developed in a pre-genomics era - so this was hard wet-bench laboratory work, it was necessary to develop this technology, and by current standards it seems like the stone age.

So we got a match, but what does that mean? Because at the time, there were lots of issues, and these are issues that are still relevant today, but we've tried to put them at rest over the last eight years.

So, first off, we had to be able to distinguish a natural event from a nefarious event - and to do that, we needed to know if the Ames strain was common in nature, was it your everyday Bacillus anthracis, or anthrax, that was spread around North America and across the United States, or in fact was it rare? Did it have only a domestic distribution in the United States, or did it have in fact a foreign distribution? In fact, at the time when this occurred, we didn't even know where the Ames strain was. At least the vast majority of the scientific community thought it was from Ames Iowa, as we found out later it wasn't.

And then of course what we finally wanted to do with this type of analysis was attribute the material that was in the letters back to a source, and to do so we needed an understanding of exactly what the Ames strain was. So when we started we were doing research tools, and they definitely needed to be validated, much like Dr. Schutzer was just talking about. This validation is something you don't do on a Thursday afternoon after someone dies of anthrax, in fact it takes lots of time.

And finally of course we were very aware that this might end up in court, and in fact we were hopeful that the evidence that we were developing would end up in court, and so we were aware of the -inaudible- we were carrying, we had constant interactions with the law enforcement community as well as the legal community, so we were thinking about admissibility in court at the same time.

So, this is a dendrogram which represented the state of the knowledge in 2001, it was something that I'd published in the Journal of Bacteriology in 2000, it was a survey of some 400 Bacillus anthracis isolates from around the world - it contained our resolution at that point in time with these eight loci that we were studying was 89 different genotypes, so these were unique types that we were able to resolve out of about 400 different types.

When we came up with the identity being the Ames strain, in fact it appeared to be somewhat robust, because there was only one isolate in this 400 that in fact matched exactly to what we found out of the first victim, and that was the Ames strain, and so this became widely known as genotype 62 - which was not the intention of this paper. Genotype 62 was merely a reference to this particular dataset - but it took on a life of its own, and for many years - it took me a long time to get the scientific community to stop calling it genotype 62, since it was only relevant to this particular technology. It appeared to be pretty robust, but in fact when you looked closer at this - and again, you've got to remember that this was all across the news media, that it was the Ames strain - "they know it's the Ames strain" - but again, we weren't really sure what the Ames strain was at that point in time.

And so what we found was if you go back and look at that same paper, we only had 84 isolates from North America, so out of the 419 total, only 84 were from North America - since this is a North American case, this is really the relevant set of isolates that were in this study. Most of those in fact were from Canada, and the Canadians have a lot of anthrax, so we had a lot of isolates from up there - only 32 then were from the United States. Of those 32 then, we had 16 unique types.

So, suddenly our power from 89 unique types shrinks, if you only consider it to be North American, and so we end up with only 16 unique types. So this is like rolling a few sets of dice and coming up with a match - and so you can see that in fact our power to draw the conclusion that the material in the letters as being the Ames strain was really much more limited than was probably being portrayed across the country at that time.

So we spent the next several years in fact building the databases to get more isolates from North America, and building the tools to get better resolution, so that we could distinguish one type of Bacillus anthracis from another. One of the important innovations here was we moved from a statistical approach for estimating relationships to what I'll call a phylogenetic approach, which is really a logic approach.

Bacillus anthracis is a clonally propagated organism, we have very good papers and datasets showing this, and because of that then as mutations occur you end up with a hierarchical arrangement of mutations, so for example you have a mutation here which is nested inside a mutation there - you can then distinguish these things based on the orderly arrangement of mutations in a population. And so that's much of what we did over the last seven or eight years is establish these relationships so that we could identify a particular type.

This mutation for example could be used for distinguishing Bacillus anthracis from Bacillus cereus, this mutation could be used for example to identify the Ames strain from all other types of Bacillus anthracis, and that's what I'll show you now.

The regions of the genome that we had been focusing on before in 2001 really occurred in what we call the VNTR loci - these are very rapidly mutating regions of the genome, which were discoverable again in a pre-genomics era by wet bench type experimentation. What we wanted to move to was very slow-evolving but numerous types of variation, or identifiers that we'll call SNPs, or single nucleotide polymorphisms. So there's a whole range of mutation activity in a Bacillus anthracis genome, and we wanted to move from these types to this type.

As Rita was mentioning earlier, really the approach to getting there required whole genome sequencing. In a pre-genomics era, discovering a handful of SNPs across a 5MB genome was essentially impossible - and so it wasn't until we had access to whole-genome sequencing that we were able to do that, and that's what we are looking at now. And so in conjunction with Claire Fraser-Liggett and Jacques Ravel, we were able to identify very important SNPs for identifying the Ames strain from everything else.

Here's an example of the whole genome sequencing tree that came out of this - actually, this became affectionately known as the Relman strategy, David Relman wrote a review of our project in Science, and for some reason it took on the moniker of the Relman strategy for many years after that - but the point is that we were able to sequence genomes and identify SNPs that were very specific - that we were able to identify the very rare and not very numerous SNPs that were able to distinguish the Ames strain from everything else.

With these SNPs in hand, Jacques Ravel designed an Affymetrix chip for genotyping, the genotyping chip which ended up with almost 3000 usable loci on it - there's probably only about 6000 or 7000 SNPs in the entire species of Bacillus anthracis and we ended up with about half of them on this chip.

From this then we are able to generate what is probably the most accurate phylogeny of any single species on earth, and that's Bacillus anthracis. So these are the data that were generated from about 136 different samples, and so we were able to, with very fine accuracy, arrange these isolates across this phylogenetic tree, which we like to think of as the population structure of Bacillus anthracis.

Now, 3000 SNPs is still a lot, and these Affy chips are still somewhat cumbersome to do, so we wanted to move to single SNP or a smaller number of SNP assays. And for that we developed this concept of what we call canonical SNPs - so on a very long branch for example where we might have a thousand SNPs, we would pick one, and we would canonize it, if you will, and use it as a marker for that particular place in the population structure.

With it then, we could use more rapid assays to categorize an unknown as to whether it was part of this population or part of this population. So this is just representing a reduction in the data analysis from a few thousand SNPs down to just a handful, maybe as few as 24, depending on what your question is.
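
The canonical SNP idea described here can be sketched in a few lines of Python. This is an illustration only: the branch names and SNP identifiers below are invented placeholders, not the markers used in the investigation, and the tree is far smaller than the real one. Each branch is tagged by one canonical SNP, and because the mutations are nested in a clonal organism, an unknown is placed by walking down from the root for as long as it carries the derived allele.

```python
# Hypothetical tree: each branch is tagged by one canonical SNP (all names invented).
# Maps a parent branch to a list of (child_branch, canonical_snp_id) pairs.
TREE = {
    "root": [("branch_A", "snp_A"), ("branch_B", "snp_B")],
    "branch_A": [("branch_A1", "snp_A1"), ("branch_A2", "snp_A2")],
    "branch_B": [],
    "branch_A1": [],
    "branch_A2": [],
}


def classify(derived_snps, tree=TREE, node="root"):
    """Return the path of branches whose canonical SNP is in the derived state."""
    path = [node]
    while True:
        matches = [child for child, snp in tree[node] if snp in derived_snps]
        if len(matches) > 1:
            # Two sibling branches cannot both be derived under clonal nesting.
            raise ValueError(f"conflicting canonical SNPs below {node}: {matches}")
        if not matches:
            # No deeper canonical SNP is derived: the sample sits on this branch.
            return path
        node = matches[0]
        path.append(node)


print(classify({"snp_A", "snp_A1"}))
```

A sample that carries derived alleles on two sibling branches is reported as a conflict rather than silently assigned, since that pattern should not occur if inheritance is strictly clonal.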

Now, this is the Ames branch itself, each of the little marks along here is a particular SNP, these are the very closest relatives to the Ames strain itself. So the Ames strain is here - again, this genome sequence was generated at TIGR, and from that then we are able to identify the SNPs that are relevant to identifying the Ames strain, and its very closest relatives. Here you'll see some Texas isolates, and then over here you see Chinese, which are the most closely related outside of Texas. This is the Ames strain itself.

The important thing then is these SNPs right here, which we would call the typomorphic? SNPs or strain-specific SNPs for the Ames strain itself. Now, how do we come to the conclusion that they are very specific for the Ames strain? Well, we did a lot of validation involving a lot of different isolates. I'll show you that now.

The assays that we like to work with are real-time PCR assays involving dual-probe competition for the SNP site during the amplification - these curves then are different amounts, so this is one nanogram, this is 10 femtograms - 10 femtograms of DNA in a test tube is about five genome equivalents - and so these assays are actually sensitive to single molecules, and I'll show you some more data on that which is published.

So they are sensitive to single molecules, and they're sensitive to a single nucleotide difference. So the tools that we developed are at the theoretical maximum of what an assay can do - you can't go below one molecule. Actually, we can - we can go to half a molecule, since DNA is double stranded. You give us half a molecule, and we can identify the Ames strain accurately. You've got to put it in a test tube - we don't deal with that side of it - but if you give us the test tube, we can tell you whether it's the Ames strain or not.

We went through extensive validation. We calculated it up a few months ago, and it turned out we ended up doing over 50,000 PCR reactions. 50,000 in this validation study - we looked at magnesium concentrations, both high and low, we looked at inhibitors that you might find in blood or you might find in the environment, we changed the cycling parameters, we changed the reaction components, we looked at near bacterial relatives, we looked at environmental backgrounds, and we did low level detection type validation.

So these are the kinds of things we went through to see if we had an assay, would it ever give us a wrong answer. Would it ever tell us it was the Ames strain when it wasn't, and would it ever tell us it was not the Ames strain when it was? And what we found was there was only one set of conditions which ever did that for us, and that was if we really contrived and changed the conditions of the reaction - thus, if we left out the probe for the non-Ames, we would then see everything as looking like Ames - but otherwise, it was impossible to change the results. We could kill the reaction, but we couldn't change the final result. It would either tell us it was the Ames strain, or it wouldn't work at all.

Again, so the low-level detection, I just want to reiterate that again, here's the 10 femtograms, which is about one and a half genome equivalents, we did a study of about 5,000 different reactions that were done right at the one molecule level. And what was very satisfying was that the results matched a Poisson distribution, perfectly. So if we could get that molecule in the test tube, we would see the result, and if we couldn't get it in the test tube, we wouldn't see any result at all.
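
The relationship between DNA mass, genome copies, and Poisson sampling can be worked through as a rough calculation. The sketch below rests on assumptions that are not from the testimony: an average base-pair mass of about 650 g/mol (a common approximation) and a genome of 5 million base pairs, taken from the round "5 MB" figure used in the talk. Under those assumptions 10 femtograms works out to a little under two genome copies, and the chance that a reaction receives at least one copy is one minus the Poisson probability of zero.

```python
import math

AVOGADRO = 6.02214076e23    # molecules per mole
BP_MASS_G_PER_MOL = 650.0   # assumed average mass of one base pair (approximation)
GENOME_BP = 5.0e6           # assumed genome size, from the round "5 MB" figure


def genome_equivalents(mass_fg):
    """Approximate number of genome copies in a given mass of DNA (femtograms)."""
    grams_per_genome = GENOME_BP * BP_MASS_G_PER_MOL / AVOGADRO
    return (mass_fg * 1e-15) / grams_per_genome


def p_detect(mean_copies):
    """Poisson probability that a reaction receives at least one template copy."""
    return 1.0 - math.exp(-mean_copies)


for mass in (10.0, 5.0, 1.0):
    copies = genome_equivalents(mass)
    print(f"{mass:5.1f} fg ~ {copies:.2f} copies, P(>=1 copy) = {p_detect(copies):.2f}")
```

With a mean of one copy per reaction, about 63% of reactions are expected to contain a template at all, which is why a perfectly sensitive assay still fails in a predictable fraction of tubes at this dilution.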

QUESTION: David - what kind of matrices were used in order to assess that level of detection?

So, you mean environmental matrices? Yeah, so we took - there's an environmental microbiologist who works in our group, actually works in our university, Edward Schwartzen, he had a set of 60 different DNAs that had been pulled out of about a dozen different soil types, and so in that case what we did is soil extractions and then we did spiking experiments to see if anything that would be left in there would be detected. We also looked for background, that is we looked to see if any of the DNAs of the microbes he was looking at, which were total soil extractions, would give us a result. And we did in fact find some Bacillus cereus, non-Ames Bacillus cereus that would amplify, but nothing came up as Ames.

QUESTION: You're speaking of actual multiple different assays, are you not?

Yeah, and I'm almost at the point where I'm going to tell you what the assays were, but in fact there's a set of five assays, three that we consider the gold standard, so to speak, three of those are chromosomal, one is on pXO2, one is on pXO1. The pXO1 assay is not 100% specific for Ames but it is a very good way to monitor for the pXO1 plasmid. That particular assay involves about four other isolates from Texas as well.

QUESTION: And then will you mention how you use the results in conjunction with each other?

In conjunction? Well, they all agree 100%, so is that what you mean?

QUESTION: Well, in theory - but in practice...

Yeah so in practice, the assays do agree - I mean, the only one that would disagree would be the pXO1 which is 99.7% specific for the Ames strain - there are three other isolates that are included in that. And so if we go back to the phylogenetic - this tree right here - so for example, these four SNPs are 100% specific, and we have real-time PCR assays for all four of these. Three of those are really good, and one of them doesn't detect down to a single molecule level, so generally we don't use that one, and then we have a pXO2 assay that also appears to be 100% specific, and there is also a pXO1 assay out here which includes these others. So that's the only one that would be incongruent, again, one of the assays wasn't quite as sensitive as the others in our validation test, so you could see it failing in situations where the other two didn't.

When you're at the Poisson level, when you're down there sampling, you can also end up with one assay working and one not due to the sampling of the genome. We don't sample at the spore level, because we've extracted DNA and put in the DNA equivalent, and of course then you end up with a Poisson there. Does that kind of answer that question?
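
One way to picture how a five-assay panel might be read together is sketched below. The assay labels and the decision rule are assumptions made for illustration, not the laboratory's actual procedure: three chromosomal assays and the pXO2 assay are treated as strain-specific, the pXO1 assay is treated as a marker shared with a few Texas isolates, and a failed reaction (for example, from Poisson sampling at low template) is recorded as no call rather than as a negative.

```python
SPECIFIC = ("chr_1", "chr_2", "chr_3", "pXO2")  # hypothetical labels, strain-specific
SHARED = "pXO1"                                 # marker also carried by a few Texas isolates


def summarize(calls):
    """Combine per-assay calls: each is 'ames', 'non_ames', or 'no_call'."""
    made = {calls.get(assay, "no_call") for assay in SPECIFIC} - {"no_call"}
    if not made:
        return "no result"          # every strain-specific reaction failed
    if len(made) > 1:
        return "incongruent"        # strain-specific assays disagree with each other
    if made == {"ames"}:
        if calls.get(SHARED) == "non_ames":
            return "incongruent"    # Ames chromosome but a non-Ames pXO1 call
        return "consistent with Ames"
    # Strain-specific assays say non-Ames; a positive pXO1 call alone does not change that.
    if calls.get(SHARED) == "ames":
        return "non-Ames, carries shared pXO1 marker"
    return "non-Ames"


print(summarize({"chr_1": "ames", "chr_2": "no_call", "chr_3": "ames",
                 "pXO2": "ames", "pXO1": "ames"}))
```

Keeping "no call" separate from "non-Ames" mirrors the point made in the validation discussion: a reaction can be killed, but a dead reaction is not evidence for the other allele.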

So the environmental background again, the inhibitors that would be there.

So these assays again were validated at the single molecule level, quite extensively so that they - these would be appropriate for environmental sampling and environmental testing. In addition, the first analysis was done by a graduate student under the supervision of a postdoc, and not a particularly good graduate student at that, and so it was important to get the personnel in place for this, and so one of the things that happened over the next few weeks and months - we had FBI-approved SOPs (standard operating procedures), we put into place chain-of-custody, we did proficiency testing on the technicians who were doing the analysis, we had a method for certifying technicians for microbial forensic analysis, instruments of course had extensive controls - probably more controls - I mean, on a 96-well plate we would have on the order of 16 real samples and everything else on that plate would be controls. We would do critical reagent testing before any forensic analysis was done, everything was witnessed and frequently there would be FBI agents witnessing, and then of course lots of blank controls during the analysis itself.

Rita kindly showed this dendrogram before, let me reiterate that these are canonical SNPs and so we have real-time PCR assays across this, again this is representative of the entire species of Bacillus anthracis. By doing a PCR reaction we are able to categorize unknowns into these particular areas and hence we could rapidly run through thousands of samples and tell you whether or not they were part of this group or part of this group or part of another group.

Here's the phylogenetic distribution of Bacillus anthracis across the world, here's that little tree I just showed you up here, there's 1033 samples in this particular analysis, as you can see a lot from North America, but also Europe, China, the Southern part of Africa, and so we know what the phylogenetic distribution across the landscape is. We've done extensive studies on this, and even since this was published in 2007 we've probably doubled the number of isolates that are in this map, especially in North America where we have much better access to material than in for example Russia, where we have almost none.

So let's zoom in on North America which is relevant for this. This is kind of the distribution of anthrax across the United States in our collection back in 2001 and 2002. Hot spot up here in the upper Midwest, some over here in Nevada, and of course this cluster down here that proved to be the Ames strain, and then up here in the Canadian wood bison up there. That's kind of where anthrax occurs in North America. If you again look at the distribution of types, you see this very dominant blue type in Canada and the United States. That's what we call the western North America type, I'll tell you more about that in a second. And then there's a very thin sliver, I think it's right in here, which turns out to be the Ames strain group. So the Ames strain is in fact relatively rare in the United States, and only found naturally at least, in the southern part of Texas, along the Rio Grande river - more on that in just a second.

So this goes back to that 2000 850 SNP tree that we did with Jacques Ravel - as you can see, we've done a lot of work here in western North America, and again, the dot is the relevant population for the question that we were asking - if we blow that up, you can see that each is a different isolate that we've genotyped in this region, so again we know a lot about this western North America group, which was the blue group in that previous tree.

If we break that out, we find that there are a number of these nodes or separate genotypes within the western North America group - remember, this is nearly an identical genotype - so we are able to resolve, using subsequent assays, essentially a monomorphic type, essentially a number of different subtypes, importantly by using phylogenetic analysis as opposed to some other type of statistical analysis, we can assign ancestral nodes vs. derived nodes. That becomes very interesting when you look at the mapping of it across North America - it turns out the ancestral nodes map in the far north, up here in Canada, and then it becomes progressively more derived as it moves south.

So we propose a model for this type of Bacillus anthracis in North America involves a north-to-south migration, perhaps coming across Beringia during the last ice age, the Pleistocene or the early Holocene. So that's the type of Bacillus anthracis that's very common in North America, it's not the Ames strain. The Ames strain is down here in this group, very distinct and different - if we blow that up, we again see the Ames strain, we see these Texas isolates that are very closely related, and we see a whole group from China. So the Ames strain, or the Ames strain type, came from China, or that's a reasonable hypothesis, came into North America, and then somehow became established down here in the southern part of the United States, right in this region along here.

So there's a cline, or a separation, between the western North America strain and the Ames strain found down here, which dominates most of the isolates we have in our collection.

QUESTION: The slide that you showed with the circles, just a couple of slides back, that represents Ames or a variety of samples from the soil?

No, almost all of these isolates come out of - sorry - almost all these isolates come out of something that died. In fact, if you go into the environment and try to isolate Bacillus anthracis, even if you know something died there of anthrax, it's very hard.

QUESTION: If you go back to the global one you had, these are all either soil or animals that died, but they don't represent the breadth of Ames in the culture collections in the world?

No, not at all. In fact, this is supposed to represent a natural distribution, so if we know for example that you have the Ames strain in Porton Down, it would only count once. So if you go back to that original study in 2000, where we only found it once, we did not beef up our numbers, so to speak, by saying that we analyzed one from this lab and that lab and that lab - so if we knew that it came from a single progenitor, or at least we had reason to believe, we'd only count it once, and normalize the data.
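
The normalization described in this answer, counting laboratory copies of one strain only once, can be sketched as a small deduplication step. The record fields and example values are hypothetical and chosen only to show the idea: isolates that trace to a known common progenitor collapse to a single entry before genotype frequencies are tallied.

```python
from collections import Counter


def genotype_counts(isolates):
    """Count genotypes, counting isolates that share a known progenitor only once."""
    seen = set()
    counts = Counter()
    for iso in isolates:
        # Isolates with no known progenitor are treated as independent samples.
        if iso.get("progenitor"):
            key = ("progenitor", iso["progenitor"])
        else:
            key = ("isolate", iso["id"])
        if key in seen:
            continue
        seen.add(key)
        counts[iso["genotype"]] += 1
    return counts


# Invented example records: three lab copies of one strain plus two field isolates.
example = [
    {"id": "lab_1", "genotype": "G_ames", "progenitor": "ames_original"},
    {"id": "lab_2", "genotype": "G_ames", "progenitor": "ames_original"},
    {"id": "lab_3", "genotype": "G_ames", "progenitor": "ames_original"},
    {"id": "field_1", "genotype": "G_other", "progenitor": None},
    {"id": "field_2", "genotype": "G_other", "progenitor": None},
]
print(genotype_counts(example))
```

Without this step, a strain that has been shared among many laboratories would look common in the database even if it is rare in nature.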

QUESTION: Why, if Ames is not representative of the North American strains, how was it the strain that became used in the assault and testing?

Yeah, it's a historical happenstance, I mean, Pat can probably answer this better than I, but it came into USAMRIID at a time when they needed a highly virulent strain to replace the Vollum. The Vollum had been used as the vaccine challenge strain for years, you know it was just a time when they needed something that was working much better, and it really probably had more to do with its characteristics at USAMRIID in animal challenges and then the fact that they were the center that was really the leader - they were the scientific leaders in this area - so people started using it as the K12 of E. coli. It was a constant - it was the constant in all these other studies.

So that's the real reason - in fact I think there's some evidence that there are even hotter strains, more virulent strains out there, those data aren't particularly good - differential virulence in Bacillus anthracis is not a field I think has been studied as intensely as needed to be, but there are reports of more virulent strains and less virulent strains.

QUESTION: So is it correct to say that Ames is a strain that is used for testing and validation around the world, or only in the U.S.?

Well, the other laboratory which I know from personal experience that uses Ames a lot would be Porton Down. And after that, there was a lot of interest - in fact, prior to 2001 we were prepared to ship the Ames strain itself to a number of different laboratories, so that they could standardize their animal testing against what was going on in the U.S., and those laboratories decided they didn't want the Ames strain anymore, after 2001 - not that we would have been able to ship it anyway.

So, here's a blowup - I'm sure that you want to get to Pat, so I'll try to hurry along here. Here's a blowup of the region in Texas, the Ames strain itself came from right here, and that's down here, and then you see these other isolates. So when we realized it was the Ames strain, along with Alex Hoffmaster at the CDC, we made a concerted effort to go in and collect these. These aren't just like random strains that got mailed to me, pardon the pun, but they are strains that we had to go out and look for.

In other words to find the relevant population here, to be sure that we could identify the laboratory Ames strain from anything else, then we had to go out and look - and so what we have here, we're talking about a 5 MB genome, and so we have five SNPs here - that's it - five SNPs that differentiate the laboratory Ames from the isolates - and you know, if you're working with Vibrio, not only would these all be the same strain, probably you wouldn't differentiate the entire Bacillus anthracis as a different species at all. So the level of variation we're talking about here is unique in that it's so low. So again, five SNPs out of five megabases differentiate these, and the validation is that all of the Ames strains we've seen in the laboratory actually have these SNPs and are differentiated including the morphs when you get to that as well - and the morphs all contain these SNPs.

And again, this fits with evolutionary theory and dogma that we would expect that all of the derived strains from this carry those characteristics.

Okay - so the way that these SNPs or these assays were used in the investigation, after all that validation, was really to define the crime scene. So my laboratory - in 2001, the federal government was actually lacking in a forensics laboratory where they could also handle things that were BSL-3. So my laboratory became the, at least one of the repositories for the FBI for biosafety level three material that was also evidentiary. So as they collected evidence, for example from the mailbox in New Jersey, or from letters or spores that were collected in the AMI building, or from the Hart Building, all that material came to my laboratory, and we ended up with on the order of 2000 pieces of evidence being stored in our BSL-3 facility, all under chain of custody, in a way that would be admissible in court eventually. And all of that evidence was analyzed with these Ames SNPs to decide what was part of the crime scene, and what wasn't.

So, these spores that were coating the inside of this mailbox, in fact proved to be the Ames strain based upon those definitive SNPs, again we did a five panel assay on material like this, so that we got congruent results in all cases, the only ones again, David, that weren't congruent were the pXO1, and those were only when we went to the Texas isolates.

So we defined the crime scene, and the crime scene was of course quite extensive - this is an FBI slide showing where the letters went - you've probably seen it before. Again, all of these were included with these assays, they all were part of that laboratory derived Ames culture, and different from anything seen in nature other than original isolation.

So this was all very important for defining the crime scene. But probably more important and less heralded was in fact our ability to exclude things from the crime scene. So in the last eight years we have numerous times gotten cases of natural outbreaks. The most, kind of one of the ones that was most important at the time was in November 2001 there were a number of cattle that died at the Hewlett Packard ranch outside of San Jose California, and this was a sample that was flown to us on a government jet, and we analyzed this one overnight as well, and it wasn't the Ames strain, it was something totally different. So what it meant then was that the FBI and law enforcement could not focus in on why cattle were dying near San Jose California, but rather focus back on the real crime scene. And again there were many examples of this, I won't go through them all - but another example would be the New York drummer, a very important result early on, I don't know if you remember this case, this is Mayor Bloomberg up here, and he seems to be quite concerned. But this gentleman was making drums from hides that came from Africa, we were quickly able to say it was not the Ames strain and was not part of what we knew - we said - was part of the crime scene. It might have been a different crime going on.

Again in 2001 I got a sample that had originally been collected by the UN, during the UNSCOM inspections of Iraq, and it was an isolate of Bacillus anthracis that came out of the weapons program that the Iraqis were doing in the 80s. And this proved not to be the Ames strain as well. That of course was very important for policy reasons and in the decisions that were being made as to where the crime might be was domestic or foreign. And there are other examples of exclusion that I won't go into.

I told you that I had a reading list for you, and I put this up here more just so that it's in the record for Erica, but we have since 2001 and of course more before, we published 43 different papers on this topic, on Bacillus anthracis, which doesn't include papers about plague or tularemia and so we publish on those as well - but there's a lot of papers. So it's out there, and the point is, in addition since 2001 I have given 120 public lectures on these same topics.

So, our work has been under peer review from the very beginning and that is an important part of the Daubert? criteria - if you are going to go to court, make it admissible. You've got to be out there and let the scientists take their crack at you - and that's something that we have been doing - and we have had to defend our work, and science, and modify our approach to things.

Out of this we came up with a paradigm for how we think you should approach forensic work, and it has to do with formulating hypotheses and then testing those hypotheses - you have to have population genetics to make it relevant, I'll just go back to the example that Rita gave you before after we sequenced the Florida strain - we had a great sequence there, but we didn't know what to compare it to. It didn't mean a whole lot by itself, and we had to go back and develop these population databases - we got a paper in Science, but in reality we needed to do those population genetic studies to make it relevant. Once you have that, then you can start to define specific hypotheses - you need to have the relevant population for comparing that, if you know if it's clonal or recombining you can come up with appropriate analysis for coming up with confidence estimation, and we definitely have confidence estimation built into our entire program. And eventually then you hope you come up with some clues or evidence that goes to court.

I want to finish by recognizing some people who were very important to this - Bruce Budowle, Mark Wilson, were critical, they were in my lab on a continuous basis. Jacques Ravel and I worked very closely together under the supervision of Claire. Alex Hoffmaster at the CDC has been important for gathering various collections, Beth George - if I was going to criticize the FBI, I will criticize the FBI, one of the biggest problems we had in the past eight years was in their contracting office.

In fact, we weren't able to get money from the FBI to do these analyses until May 2002, and instead it was Beth George and Pete Cintia at the Department of Energy who said, use our money, you've already got a contract in place, use our money and get this done. So I would say that for the next crisis it would be nice if the federal government had a couple of SGER grants out there - to get the money we had to do the work, because I still had to pay salaries, people had to go home and put gas in their car and feed their kids, and patriotism only gets you so far when you have to do things like that - so Beth George was a real hero in getting us the money - and Rita, as she pointed out, those SGER grants really got off the ground fast, and it was important to do.

So, questions now or questions later?

QUESTION: I should like to point out that um, the meetings that we had included speakers.

Right.

QUESTION: Paul Keim and others would present data, so, constant interaction.

Yeah, I was invited to those meetings many times, it's just that they were in Virginia and I live in Arizona. I did attend at least one of them, though.

QUESTION: It's not a forensic question, just a point of interest. So you don't detect anthracis in the environment? Whereas you detect other endospore forming species readily?

Yeah, but you pick one, and I'll say, find one strain and then let's make an assay for that strain, and you would have a hard time finding that one, too.

QUESTION: Well, subtilis is readily found.

Yeah, but subtilis is an incredibly diverse organism. So let's take one isolate from your laboratory, for example, and let's make an assay that is very specific for that isolate and let's go to the environment and see if we can find it, and I would say you would have a hard time finding it. You could find subtilis, but subtilis is kind of like E. coli, you know, it's really diverse. We're talking about - let's use the E. coli example. Bacillus anthracis is a clonal derivative of Bacillus cereus - it is less diverse as a species than E. coli O157:H7, for example, at least I think it is. Rich may be able to correct me on this one. But that's an example of a very very defined clone that came out of E. coli, and it is still more diverse than Bacillus anthracis, and what we've done then is gone back and found a particular subclone within that. So Bacillus anthracis should probably never have been called a species. So we should call this Bacillus cereus subspecies anthracis.

But because it has such a dramatically different biology than Bacillus cereus - it causes catastrophic disease - classical microbiologists gave it species status - because of its really unique biology. But there is only a small number of genetic differences between it and Bacillus cereus. But your point is well taken - if you go out in the environment and look for something like cereus, you'll find it. You go out in the environment and look for Bacillus anthracis, I don't think you'll find it - and people have tried to do this in places where they think Bacillus anthracis should be. And BioWatch doesn't get any Bacillus anthracis hits, and they've been testing 10,000 samples per year for four or five years. So - it's a very defined, small set of bacteria.

QUESTION: Paul, if you were to, and I don't mean to put this the wrong way, but if you were to step back and say, critique your own work, where do you think the most work still remains to be done? Where are the needs the greatest? If this were to be taken to the next step, or applied tomorrow in the best possible way, what more would you like to see done, where?

Well, first off we were limited by technology - and over the last eight years technology has changed dramatically. Because we were working with Claire and had the support of the genomics community, I don't think that we were too far behind the curve - and I'm also a faculty member at TGen which is also a genomics group. So we hopped on genomics as fast as we could. We didn't have the populations - we didn't have the samples - so even now, it would be better if we could go in and do a more extensive sampling of Texas and other areas. And the federal government has put all kinds of hurdles in our way - with the select agent rules, and the shipping - you know, we spend 40% of our time, Arturo, working on select agent rules. And that's the problem - it's hard to do this work, because of the regulatory constraints now. And what that means is you have to put twice as much effort into something to get anything back.

The other thing that happened when the select agent rules changed in 2002, there are a number of laboratories that destroyed their collections. The Texas State health labs, for example, destroyed hundreds of isolates that they collected over the years of Bacillus anthracis, and when they destroyed that we basically lost our forensically valid database of populations, and so - and I remember sitting in meetings where people in the federal government said, oh, they won't destroy their collections, they'll just ship them to you. No - that isn't what happened - they destroyed them. I'll name names later.

QUESTION: And I know you're sensitive about this issue - you just raised it - but how in your estimation, how representative are collections of B. anthracis um, elsewhere in the world? How representative is our notion of what is where around the globe?

It's hard to say. What we get is a snapshot in time. For example, what we have is an excellent collection from China - we have almost 200 isolates from China. But if you look at the dates, most of them were actually collected by one expedition to one province. Okay - so it's a snapshot of that one place in that one year. So we don't have good time series - other than maybe the United States right now. We don't have good time series - they don't go back very far - and they tend to focus on outbreaks. And again, we tried to compensate for that, but we don't always know - they have to be considered the best we've got, but also considered suspect at the same time.

QUESTION: You made the split early in your talk between natural and nefarious, and it seems as though you have knocked that one out of the park. Leaving aside any other evidence, it's very clear that this was not a natural outbreak - but part of our charge is related to - given that it was nefarious, the evidence line there, and I'm wondering whether you have any thoughts or - what do you think we as a committee should be paying particular attention to?

I mean, it's easier to critique my own work than that of others - but what it comes down to, is the committee is going to have to look at the morphs and the frequency numbers very carefully. In my mind, I would really like to know whether those morphs are under selection under the conditions that were used to raise them - I suspect they are - but there are no experiments that I know of for that. Uh - I wonder if you would go out and do 35 batches of spores, at 10 liters, would you see that same repertoire of morphs again? I think you would. I think that those morphs are inevitable given the growth conditions that were there.

Now, that said, how many times has a batch of spores like RMR-1029 ever been produced? Well, I think it's only ever been produced once, you know, at least in that combination.

And, so those are the types of questions I should be asking, is about the selectability of those things.

Rita Colwell COMMENT: I'd like to emphasize that what needs to be done and what should be done is to continue the investment in the sequencing of the genomic work that the various agencies have funding - that funding should continue. It is really important to have multiple genome sequences of the given species, in order to understand the context

Yeah, so the original "Relman strategy" was to sequence the most diverse isolates - of course, now we're at the other end of the spectrum and you have to convince people that you have to sequence things that are really really close together to tell them apart - so we'll call that the Keim strategy.

So, Rich, the other thing about that is you know, with the morphs, the competition experiments that you do, like with your populations, would be so straightforward and so relevant, you know, as you will or have heard, there are assays for those morphs, you can set up these growth experiments and do the competition and see how they perform versus the wild-type very easily, and I think that will give you a lot more insights to how you end up with that set of morphs inside of something like RMR-1029.
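[Editorial note, not part of the testimony: the competition experiment described above is usually summarized as a relative fitness, the ratio of the two strains' realized growth over the same co-culture. The sketch below assumes that measure; the testimony does not say which statistic would be used, and the function name and colony counts are hypothetical.]

```python
import math


def relative_fitness(morph_start, morph_end, wild_start, wild_end):
    """Ratio of realized growth (natural log of fold-change) of a morph
    versus the wild type, both measured in the same mixed culture."""
    if min(morph_start, morph_end, wild_start, wild_end) <= 0:
        raise ValueError("all counts must be positive")
    wild_growth = math.log(wild_end / wild_start)
    if wild_growth == 0:
        raise ValueError("wild type showed no net change; ratio undefined")
    return math.log(morph_end / morph_start) / wild_growth


# Hypothetical plate counts (CFU/ml) at the start and end of a co-culture.
w = relative_fitness(morph_start=1e5, morph_end=1e8,
                     wild_start=1e5, wild_end=1e7)
print(f"relative fitness of morph vs. wild type: {w:.2f}")
```

A value above 1 would mean the morph outgrew the wild type under those conditions, which is the kind of selection the speaker suspects; a value near 1 would point toward neutrality.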

QUESTION: So Paul, earlier you said that Bacillus anthracis may not be a species but would be the tail end of Bacillus cereus, so you have analyzed 2000 strains, so is it pathogenicity as the one criterion that you have made as Bacillus anthracis, or how did you - what kind of markers did you come up with to say this is Bacillus anthracis?

Yeah, well, I got a multipart answer. First off, I would not propose changing the name Bacillus anthracis now. There are too many legal and regulatory complications to that, so we're going to leave it as a species because of the rest of the world. Scientifically, we know where it sits. Our analysis doesn't involve any pathogenicity - so we look at the pattern of the nucleotide variation inside of this, and we come up with a phylogenetic tree, and it fits the criteria for what we would call a monophyletic clade, so it's a single group - we can define SNPs or other markers that say, everything that contains these falls into this group, and legally, regulatory and traditionally we call that Bacillus anthracis.

QUESTION: So it is possible that these strains might not have pXO1 and pXO2?

Sure, absolutely. Many examples of strains that are now attenuated for lots of different reasons, besides the plasmids, and we would still call them Bacillus anthracis - but we would presume that they were virulent, or that their ancestors were virulent.

QUESTION: Part of our charge is to talk about validation of the methods used - I just feel that I have to ask the question. Early on, versus now - how much catching up did you have to do in validation?

Tremendous, tremendous. I mean, the level of validation that you do for a research paper that goes into Science - especially if it goes into Science, or the Journal of Bacteriology, is very different from what you do for forensics. Things that we absolutely knew were true, we still spent six months and a lot of money to prove. If you add EDTA, if you add humic acid, melony, all these things - you know, one of the objections - Bruce was sure, that if we went to the single molecule level, and we ran it lots of times, sooner or later a stochastic event would occur, that would give you the wrong result, and it never happened. The reason is the stochastic event would have been a polymerase mistake - I used to work in DNA replication - it would have been a polymerase mistake where it would have put in the wrong nucleotide - and that mistake we've measured, many times, and it would have occurred at 10^-5, 10^-6. So we would have had to run a million times to have seen that mistake - and we didn't, we only ran 5000 reactions. But even then, if we had put in a single-stranded piece of DNA - we are actually starting with two, not one - and so it would have looked like a heterozygote - and we've done lots of analysis on mixtures as well.

So, we didn't see it - but that was the type of scientific critique that we were getting and we were responsive to. And again, if someone were to tell you to prove that a polymerase makes a mistake at 10^-6 for a paper that you publish, you'd say forget it - but for this purpose we did it. We did three years on these assays, trying to break them, and all we could do was get them to fail altogether, we couldn't get them to give us the wrong answer.
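[Editorial note, not part of the testimony: the arithmetic behind this answer can be sketched in a few lines. The sketch assumes each single-molecule reaction is an independent trial with one chance of a misincorporation at the assayed position, which is a simplification; the function names are hypothetical, and the rates and the 5000-reaction figure are the ones quoted above.]

```python
import math


def expected_errors(error_rate, n_reactions):
    """Expected number of wrong calls if each reaction errs independently."""
    return error_rate * n_reactions


def prob_at_least_one_error(error_rate, n_reactions):
    """1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p."""
    return -math.expm1(n_reactions * math.log1p(-error_rate))


n = 5000  # reactions quoted in the testimony
for rate in (1e-5, 1e-6):  # polymerase error rates quoted in the testimony
    print(f"rate={rate:g}: expected errors={expected_errors(rate, n):.3g}, "
          f"P(at least one)={prob_at_least_one_error(rate, n):.3g}")
```

Under these assumptions the chance of seeing even one wrong call in 5000 reactions is a few percent at the higher rate and well under one percent at the lower rate, which is consistent with the speaker's report that the error never showed up.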

QUESTION: The investigation obviously focused on mutations that could be correlated with -inaudible-. But there must be mutations that were silent - inaudible - can you think of -inaudible- comparison of these samples by deep sequencing could have -inaudible- the samples, or separated them?

Yeah, in defense - my role in the morphs is really in handling the material, the live material, and extracting DNA. We were quite busy doing other things at the time, so we weren't involved in assay development and validation - I assume you are going to hear from people who did that. So let's assume for a moment - this is not proven - that these morphs actually have a selective advantage under large growth, such as we saw. And so what that means is that if you repeat this experiment, you know, normally we like to compare results against a random model. So let's go out and repeat the experiment a thousand times so that we get a confidence estimation of point zero zero one. Well, I would guess that 99% of the time, you're going to see these morphs again, if you repeat this experiment the very same way.

So they're not a random event - I mean, the event itself is random but the numbers here are very large - you're dealing with 10^12 or 10^15 spores, and you're dealing with two or more generations, so even rare events are going to happen, predictably. The question is, why do they become such a large frequency? My guess is, they're actually under selection. So if instead you were to go to silent mutations, like you're suggesting, and there are doubtlessly - when you have numbers this big, every mutation you can imagine has occurred in that population. If instead you use a repertoire, and you can go to very much higher numbers than four, you use a repertoire of neutral and silent mutations, you would come up with something like a distinctive fingerprint for this type of a batch that would not be replicated inevitably if you repeat the experiment again. Rich actually knows more about this than I do, so you can ask him that in closed session.

QUESTION: So the answer is -inaudible- developed turned out to be indels, -inaudible-

Yeah, so I've heard Claire talk about this, so I'll paraphrase what I've heard her say in public. She said they didn't want to use SNPs because they weren't as stable as indels. And the reason she says that, is when you have an indel, and it's the right kind of indel, you've got to qualify that, that piece of DNA comes out of the chromosome and it's gone. So there's really no way for it to come back in - at least, that's a pretty valid assumption for certain types of indels - not all of them. The worry - the way she states the question is, if you have a SNP, it can mutate from a G to an A, it can mutate back. At least, that's the logic that she's used for not using SNPs. Now in a population context, and if in fact those SNPs are indeed related to a morph, or a phenotype that's under selection, you do worry that that selection could reverse and it could go back.

But in the total structure of Bacillus anthracis, we see very very few reversals. You know, we did a study that was published in 2004 in PNAS where we looked at, I think it was 1500 SNPs, and we saw, I think it was four reversals in the entire population - and it turned out those were amino acids and so there was probably some kind of selection going on there - for the most part, we don't see that happening. So you pick your SNPs right so they're really solid, maybe intergenic - they're unlikely to revert back - and plus, you could use many of them - hundreds, thousands, certainly you could do thousands. You could do a whole chip. Now the chip itself would not give you that type of information from RMR-1029, because in that case all those SNPs would have been fixed in the population - you are looking for new SNPs that occurred after the derivation of the Ames strain, in fact after the construction - during the construction of RMR-1029.

QUESTION: Did you say that anyone who took large, multi-liter batches and concentrated them over time, you would create selection pressure to create those morphs?

First off, I'm speculating, make sure that you understand that, I don't actually know - but I do wonder if in fact these morphs are under selection for growth conditions, such as growing ten-liter batches. So if you go back - it's not the concentration part, it's the growing the big batches, and the 35-some batches.

The indels - the other thing about these types of indels is they tend to occur at higher rates than SNP mutations. Most indels occur between directly repeated pieces of DNA or some other structure in the DNA. And in fact, Pat Worsham will talk - she may or may not talk about this, but she's actually studied one locus that creates an asporogenic morphotype that seems to happen - you can see it more than once. And so indels can happen at a higher frequency due to genome structural things - and so they occur at higher frequencies, maybe 10^-7, 10^-8, it depends on the particular one. So you're even more likely to see those in these large populations, and then if they have a selective advantage, they'll come up to a frequency that's easily observable. Now - ask Pat that question later.
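[Editorial note, not part of the testimony: a rough back-of-the-envelope version of "you're even more likely to see those in these large populations". It assumes the number of cell divisions is about equal to the final population size and ignores selection entirely, so it only gives an order of magnitude. The function name is hypothetical; the rates are the ones quoted above and the population sizes are the spore counts mentioned earlier in the testimony.]

```python
def expected_mutation_events(population_size, rate_per_division):
    """Order-of-magnitude count of independent mutation events at one locus,
    assuming roughly one division per cell in the final population."""
    return population_size * rate_per_division


for population in (1e12, 1e15):   # spore counts mentioned in the testimony
    for rate in (1e-7, 1e-8):     # indel rates mentioned in the testimony
        events = expected_mutation_events(population, rate)
        print(f"N={population:.0e}, rate={rate:.0e}: about {events:.0e} events")
```

Even at the lower rate and the smaller population, this crude estimate gives thousands of independent occurrences of a given indel, which is why the speaker treats their appearance as predictable and puts the open question on why they rose to an observable frequency.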

QUESTION: I just want to clarify one thing. I liked your process at the -inaudible- investigation, and one of the aspects of it was the development of population - inaudible - what's been done so far?

Yeah. We really relied on the CDC and natural surveillance to get access to these. One of the things that has happened in the changed regulatory environment is that people have not wanted to cooperate - they haven't been allowed to save isolates, you know, there are large ranches for example in that part of Texas that have outbreaks of anthrax on a regular basis, and they won't report it, so I'm not sure the answer is sending in black helicopters to pick up dead cows, but that's the kind of situation we're in, where we have not been able to get access to material to define that - so we only had four or five new isolates from that Ames branch and then we were into China. And so that is - everything says that we've got it nailed - but it would be nice to have a hundred samples instead of five.

QUESTION: Paul this is just for clarification - someone mentioned earlier that there had been some laboratory work looking at stability of some of these mutations. Now I don't know whether the comment was made with respect to morph mutations, or lineage - you know, specific mutations and I am assuming that if it were the latter, they have been looked at experimentally.

Yeah, I don't know anything about the morph stability. I was not involved in that part of the investigation other than tangentially. But the lineage mutations we've validated by looking at, now, 2000 independent isolates around the world and we do see a very very small amount of reversion. So they're very stable, and again the five SNPs we've used for assays have been looked at very closely.

QUESTION: But no experimental work, serial propagation?

Well, I've known Rich for a long time - so we did a Lenski experiment on Bacillus anthracis back in the late 1990s, and we had it sitting in the freezer, and it was done with the Ames strain, in fact, and so the loci we were working with were the VNTR loci, and we had seen a small number of mutations in those loci, because they mutate at such a fast rate. I'm not sure we've ever done the SNP analysis on those, because we really didn't think it was worth the trouble.

QUESTION: Could you repeat that? What mutations did you see?

So the first typing system or fingerprinting system that we used is what's called a VNTR, variable number of tandem repeats...

QUESTION: Oh, I see, you didn't look for SPOs for example?

Now we did not - but you know, I've cracked open vials from ATCC and seen multiple morphs, Pat's the expert on this. But the SPO mutation - I'll tell you, when we selected for Cipro mutants we saw all sorts of SPO mutations - and you know, Cipro is a mutagen as well as an antibiotic, it's a mutagen if you're a bacteria anyway, and so there were a number of SPO-type mutations that did occur, probably due to deletions, since it's a double-strand break mechanism. I think SPO mutations are selected against in nature, of course, because you've got to have that spore for the ecological infective cycle, but...

QUESTION: Not necessarily in a flask?

Not in a flask, not when you're growing 35 ten-liter fermentation batches, or whatever it was.

MODERATOR: Well, thank you for your 121st lecture, here with us, since 2001, on this subject. Once again, our committee has...

-end transcript-

This certainly seems to validate all the work used to identify the strain used as the laboratory Ames strain - but what it does not address is the validity of the "morphs" which form a key element in the FBI claim that Bruce Ivins was the culprit. More on that later.