Some of you may recall Kyle Wiens, the tech company CEO who created a bit of an online fuss last year with his article in the Harvard Business Review, "I Won't Hire People Who Use Poor Grammar. Here's Why". In it, he argued that "good grammar is credibility", and that attention to grammar in writing is representative of attention to detail in all aspects of work. Well, he's at it again, this time with "Your Company is Only As Good As Your Writing". He defends himself from some of the criticism he received as a result of the previous article, but also addresses some of the challenges that companies, particularly technology companies, face with both internal and external communication. It's a worthwhile read, particularly because he also describes the internal "manual" that he and his colleagues developed to encourage good writing. When it was finished, they polished it up and put it online as the "Tech Writing Handbook". Though it may not be as useful to scientists who are only focused on writing academic papers, it certainly has the potential to help out anyone who has made (or is thinking of making) the jump to technical writing. Enjoy the tips and happy writing!
In recent years, the increasing popularity of open-access journals has resulted in scientific publishing that is quick, easily accessible, and more democratic - all great things, right? However, the New York Times has a new article out that looks at the dark side of this publishing revolution: new, "predatory" journals that take advantage of the boom in online publishing to dupe unsuspecting scientists out of potential thousands in publishing fees. The scams are various, and I strongly suggest hopping over to read the article in its entirety. As they say, forewarned is forearmed! And you, dear readers - any suggestions (or horror stories) you'd like to share with fellow scientists?
Many researchers come to us because English editing has been included in the list of changes requested before their papers can be published. However, this is frustrating for many, and understandably so, because they had their manuscripts reviewed for English errors prior to initial submission!
It is important to recognize that there are large differences in the quality of service that editors provide. Since research funding is often scarce, it can be tempting to go with an editor who will only charge 60-80 euros for a 20+ page manuscript. However, the editing will probably be light and focus exclusively on glaring errors, such as incorrect verb tenses, obvious idiom problems, and spelling mistakes. This type of editing will clean up the manuscript to a certain extent. However, it will do nothing to deal with the types of issues that most often provoke reviewers to ask for English editing: those pertaining to sentence structure and fluidity.
For example, take the following sentence, which came to us in an "edited" manuscript:
“Because of their abundance in most terrestrial habitats, their foraging decisions may have important consequences at the ecosystem level.”
At first glance, it seems correct - there are no problems with subject/verb agreement, incorrect verb tenses, or improper preposition use. However, if you look more closely, the sentence is actually rather confusing. It appears to be saying that “their foraging decisions” - the grammatical subject of the sentence - are abundant in the environment and thus have important consequences for ecosystems, which makes no sense.
Now consider our alternative:
“Because ants are abundant in most terrestrial habitats, their foraging decisions may have important consequences at the ecosystem level.”
A small but important change suddenly makes the meaning clear – the ants are the ones that are abundant, and it is their foraging decisions that are of consequence. These types of more complicated, yet crucial, grammatical issues often get ignored by those doing quick, cheap editing. In fact, we have recently spent a lot of time on this blog discussing similar issues (see our series on "Reader Expectations") because addressing this kind of problem is extremely important, but often overlooked, in helping readers (and reviewers!) follow your thoughts.
As a consequence, while we may charge as little as 60 euros if the editing is light and the manuscript short, we will not accept a job only to do it halfway; some manuscripts clearly need more thorough editing. It is for this reason that we offer price estimates only after having looked at a text. Additionally, if a manuscript has already been submitted, we like to look at reviewers’ comments to see examples of the types of corrections they are requesting. Knowing exactly what kinds of problems need to be addressed helps us, and ultimately our authors, to improve the language of the text as much as possible. It's self-serving for us to say so, but it really is true: investing a little more in your editing budget upfront will help you save time and money in the long run!
Among grammarians, few topics cause more debate than the Oxford comma. Disagreements about its use have been known to incite feuds rivaling those of the Hatfields and the McCoys, the Ghibellines and the Guelphs, or the houses of York and Lancaster - okay, okay, I kid. However, this simple comma has the power to raise tempers and ignite passionate debate more than any other grammatical topic I can think of. When news broke last year that the Oxford University Style Guide was abandoning its own comma, thousands (millions?) of its devoted fans took to social media outlets in lamentation, collectively breathing a sigh of relief when the report turned out to be false. Just what is this particular item of punctuation used for, and why do people care about it so much?
The first part of the question is easy to answer. The Oxford comma, also known as the serial comma, is the final comma used in a series or list, together with "and" or "or". Therefore, a sentence with a correctly used Oxford comma would be:
"We studied the drug's effect on the subjects' appetites, sleep patterns, and length of life." The same sentence without the Oxford comma would be "We studied the drug's effect on the subjects' appetites, sleep patterns and length of life."
Here you're starting to think, "Okay, but what's the big deal? Both of those sentences mean exactly the same thing!" In this case, you're right - both of these sentences would be interpreted by most reasonable readers to mean the same thing. However, consider the following examples:
"We quantified the expression of the two new genes, Maj10, and Mak20."
"We quantified the expression of the two new genes, Maj10 and Mak20."
In the first example, the commas are used to identify individual items in a list, and the reader can see that we've quantified the expression of four genes: two new genes (presumably identified previously in the text), plus Maj10 and Mak20. However, it's not clear in the second example if "the two new genes, Maj10 and Mak20" makes up a list of four different genes, or whether the comma is used simply to set off additional information that describes the noun before it ("genes"). If the latter is true, we have only two genes, named Maj10 and Mak20. See the problem? If we chose not to use the Oxford comma in our writing, sentence #2 could describe either situation. Some readers may infer that we studied four genes, while other readers, expecting to see an Oxford comma in this situation, will infer that we studied only two genes. By contrast, sentence #1 is clearly a list of four genes.
Now that we know what it is, then, let's deal with the second half of the question: why does it cause such emotionally charged debates? For the sake of full disclosure, I should reveal that I am a proud member of Team Oxford Comma, for two reasons. First, because Mr. Stanich taught me to use it when I was 11, and when I asked him "Why?", he said "Because". Old habits die hard. The real reason, however, is that it eliminates confusion. If I use the Oxford comma correctly and consistently, my readers will have no difficulty in interpreting sentence #1 as a list of four genes and sentence #2 as an identification of two genes. They will reason, rightly so, that if I'd meant #2 to be a list of four genes, I would have put an Oxford comma in it. Instead, if I use the Oxford comma haphazardly, or not at all, sentence #2 will, at best, sidetrack my readers with unnecessary head-scratching, and, at worst, lead them down the exact opposite path from what I intended.
Most writers who advocate minimal usage of the Oxford comma are journalists, following AP style guidelines. When you think about it, this makes sense - writing for a newspaper requires packing as much information as possible into limited physical space, and taking out individual punctuation marks could make a difference. In academic publishing, however, the benefit (space for 10-15 extra characters) doesn't come close to outweighing the cost (imprecision and potential reader confusion). If you're trying to communicate a complex, involved research project to unknown readers, why handicap yourself? Team Oxford Comma for the gold!
We're down to one last guideline from Dr. George Gopen and Dr. Judy Swan's advice to writers, published in their article "The Science of Scientific Writing". Using the following passage, let's explore one of the most crucial aspects of scientific writing - clearly identifying the subject of your story.
"Transcription of the 5S RNA genes in the egg extract is TFIIIA-dependent. This is surprising, because the concentration of TFIIIA is the same as in the oocyte nuclear extract. The other transcription factors and RNA polymerase III are presumed to be in excess over available TFIIIA, because tRNA genes are transcribed in the egg extract. The addition of egg extract to the oocyte nuclear extract has two effects on transcription efficiency. First, there is a general inhibition of transcription that can be alleviated in part by supplementation with high concentrations of RNA polymerase III. Second, egg extract destabilizes transcription complexes formed with oocyte but not somatic 5S RNA genes. "
Of the many problems with this paragraph, one of the most apparent is that its exact subject is unclear. Is it TFIIIA? Egg extract? Gene transcription? Every sentence implies something new, and further, we lack crucial clues that might help us identify which of all these potential subjects is the most important. According to Gopen and Swan, readers most often look for such clues in the verb, giving us our last reader expectation:
"Readers expect the action of a sentence to be articulated by the verb."
In this passage, pinpointing the action taking place in the text might help us to untangle the story the paragraph is trying to tell. Unfortunately, we have some decidedly unhelpful verbs, among them "is" (three times), "are presumed to be", "has", and "can be alleviated". The verbs "are transcribed" and "destabilizes" are slightly more informative, but we still lack sufficient information to help us make more than an educated guess at the meaning of the passage. From the observation that "egg extract" and "TFIIIA" are mentioned the most frequently in the text, Gopen and Swan assume that these are the subjects and provide the following rewrite, relying more heavily on active verbs to tell the story:
"In the egg extract, the availability of TFIIIA limits transcription of the 5S RNA genes. This is surprising because the same concentration of TFIIIA does not limit transcription in the oocyte nuclear extract. In the egg extract, transcription is not limited by RNA polymerase or other factors because transcription of tRNA genes indicates that these factors are in excess over available TFIIIA. When added to the nuclear extract, the egg extract affected the efficiency of transcription in two ways. First, it inhibited transcription generally; this inhibition could be alleviated in part by supplementing the mixture with high concentrations of RNA polymerase III. Second, the egg extract destabilized transcription complexes formed by oocyte but not by somatic 5S genes."
Even after the rewrite, there are still many unexplained connections in the text, particularly those between TFIIIA as the limiting factor for transcription in the egg extract and the egg extract as an inhibitor of transcription in the nuclear extract. Because the verbs now describe the action of the paragraph more clearly, we have a better idea of what the original authors actually did. However, we're still left guessing as to what their hypotheses were and whether these results support or refute them. Yet this is precisely the information that we, as scientific readers, are most interested in! Unfortunately, at this point, we don't have any more information in this paragraph to help us, and we're left wondering about what should be the most important message of the entire text.
In summary, in all of the passages we've looked at (taken, you'll recall, from actual published papers), we've seen how poor writing hampers the flow of ideas from the authors to the readers. In some cases, the authors' central messages were reasonably obvious, and we were able to decipher the meaning of the passage with only a little extra work. In others, however, the meaning of the text remained obscure despite our best efforts to illuminate it.
It is important to put what we have learned in the context of real-life science. As we read these passages in the course of these blog posts (and re-read them, debated their meaning, re-structured them, and then thought about them some more), we engaged in an academic exercise in text analysis. However, in real life, harried researchers are unlikely to spend much time interpreting muddled writing, and journal editors even less so. There is often a tendency to say, "Well, the quality of the data matters more than the quality of the writing." Given the competition to publish, this statement is far from true. The quality of your writing will either highlight or obscure the quality of your data. The burden is on the writer to clarify his or her thoughts as much as possible, so that the reader can sit back and enjoy the story. Skilled, accomplished writers use Gopen and Swan's guidelines subconsciously to help their readers understand exactly what they are meant to understand, but the rest of us can learn to do so too!
As promised, here's a recap of Gopen and Swan's advice based on their studies of reader expectations:
"1. Follow a grammatical subject as soon as possible with its verb.
2. Place in the stress position the "new information" you want the reader to emphasize.
3. Place the person or thing whose "story" a sentence is telling at the beginning of the sentence, in the topic position.
4. Place appropriate "old information" (material already stated in the discourse) in the topic position for linkage backward and contextualization forward.
5. Articulate the action of every clause or sentence in its verb.
6. In general, provide context for your reader before asking that reader to consider anything new.
7. In general, try to ensure that the relative emphases of the substance coincide with the relative expectations for emphasis generated by the structure."
The next time you sit down to write a journal article, grant proposal, or even a presentation, glance at this list every once in a while and try to take its advice. It may not be easy at first, but little by little, the "story" of your text will emerge in a way that all readers will appreciate. In time, you'll find that you've internalized these guidelines so thoroughly that writing to fulfill your readers' expectations will become natural. Your readers (and editors) will thank you!
Gopen, G. and J. Swan. "The Science of Scientific Writing." American Scientist. Nov.-Dec. 1990. Accessed from http://www.americanscientist.org/issues/pub/the-science-of-scientific-writing/.
Thus far in our exploration of Dr. George Gopen and Dr. Judy Swan's advice to scientific writers (found in their article "The Science of Scientific Writing"), we've learned the importance of close subject-verb placement, the proper use of stress positions, and the benefits of appropriately structured topic positions. The real strength of these techniques, however, lies in their ability to identify logical gaps, areas where important information has been unconsciously omitted. To illustrate this, let's take a look at Gopen and Swan's next example.
"The enthalpy of hydrogen bond formation between the nucleoside bases 2'deoxyguanosine (dG) and 2'deoxycytidine (dC) has been determined by direct measurement. dG and dC were derivatized at the 5' and 3' hydroxyls with triisopropylsilyl groups to obtain solubility of the nucleosides in non-aqueous solvents and to prevent the ribose hydroxyls from forming hydrogen bonds. From isoperibolic titration measurements, the enthalpy of dC:dG base pair formation is -6.65±0.32 kcal/mol."
We can see some familiar issues right away: the subject is separated from the verb in the first sentence; "enthalpy" is mentioned in the first and last sentences, but nowhere in between; and the new material worthy of emphasis in the second sentence is not immediately apparent. By making some assumptions about the relative importance of the new material introduced here, Gopen and Swan produce the following revision:
"We have directly measured the enthalpy of hydrogen bond formation between the nucleoside bases 2'deoxyguanosine (dG) and 2'deoxycytidine (dC). dG and dC were derivatized at the 5' and 3' hydroxyls with triisopropylsilyl groups; these groups serve both to solubilize the nucleosides in non-aqueous solvents and to prevent the ribose hydroxyls from forming hydrogen bonds. From isoperibolic titration measurements, the enthalpy of dC:dG base pair formation is -6.65±0.32 kcal/mol."
We're starting to get an idea of what the original authors did, but we're no clearer on the connection between the derivatization with triisopropylsilyl groups and the measurement of enthalpy than we were before: once the sentences have been restructured to highlight the link between one topic and the next, it only becomes more apparent that that linkage isn't there. In all likelihood, the connection between the derivatization and the enthalpy measurements was so obvious to the authors that it never occurred to them to make it explicit, and some specialized readers might also be able to "jump" the logical gap. For those without extensive background in this area, however, Gopen and Swan supply what they think the missing information might be, and complete their revision thus:
"We have directly measured the enthalpy of hydrogen bond formation between the nucleoside bases 2'deoxyguanosine (dG) and 2'deoxycytidine (dC). dG and dC were derivatized at the 5' and 3' hydroxyls with triisopropylsilyl groups; these groups serve both to solubilize the nucleosides in non-aqueous solvents and to prevent the ribose hydroxyls from forming hydrogen bonds. Consequently, when the derivatized nucleosides are dissolved in non-aqueous solvents, hydrogen bonds form almost exclusively between the bases. Since the interbase hydrogen bonds are the only bonds to form upon mixing, their enthalpy of formation can be determined directly by measuring the enthalpy of mixing. From our isoperibolic titration measurements, the enthalpy of dG:dC base pair formation is -6.65±0.32 kcal/mol."
With the two linking sentences inserted into the paragraph, the connection between each sentence and the next becomes clear, and the paragraph becomes understandable even by those whose last experience with chemistry was during their undergraduate years. Here, the power of a thorough understanding of reader expectations becomes apparent: with these guidelines in mind, it is possible to create writing that is accessible to nearly everyone, not just to the specialists in your own narrow field. When writing your own manuscripts, it is also worthwhile to remember that even specialist reviewers and readers often have limited time to read the literature. They will thank you for making the experience more efficient!
I'd intended to wrap up our study of reader expectations this week, but there's still one major topic left to cover. In the interest of keeping the blog posts short and easily digestible, I decided to postpone the last topic, with a recap of all the major reader expectations, until next time. If you can't stand the suspense, head on over to American Scientist to check out the original article!
Reader Expectations, Part 2
In my previous post, I summarized some of Dr. George Gopen and Dr. Judy Swan's advice to scientific writers, set out in their article "The Science of Scientific Writing." They start out by explaining that readers expect to find grammatical subjects and their respective verbs close together, and that any information placed between subjects and verbs is unconsciously viewed as an aside to the main action of the sentence. The authors also spend a great deal of time explaining the concept of the "stress position", the position at the end of a sentence or phrase where readers expect to find the most important material. As writers, however, simply placing important information where readers expect to find it doesn't completely fulfill our responsibility. In order to really understand a passage, readers must have a context in which to mentally situate all of the emphasized material. To illustrate this, Gopen and Swan introduce the concept of the "topic position".
If the stress position is usually found at the end of a sentence, the topic position is usually found at the beginning, identifying the subject under discussion and providing context for all subsequent information. Consider two of Gopen and Swan's examples: "Bees disperse pollen," and "Pollen is dispersed by bees." On the surface, both sentences seem to give us the exact same information. However, with "bees" in the topic position, we would expect to find the first sentence in a paragraph about bees. The second sentence, instead, tells a story about pollen, presumably leading the reader on to more information about pollen. Here, the topic position gives us the context for all following information.
The topic position also serves another important function: it links old information, previously presented in the text, to new information that will appear in the subsequent stress position. By properly exploiting topic and stress positions, the writer can present information seamlessly and logically, in a manner that is both easy to understand and pleasing to read; this is what professional writers refer to as the "flow" of a passage. However, improper use of both can lead to chaos in a seemingly well-constructed paragraph. Case in point: Gopen and Swan's next example, taken from an actual published paper.
"Large earthquakes along a given fault segment do not occur at random intervals because it takes time to accumulate the strain energy for the rupture. The rates at which tectonic plates move and accumulate strain at their boundaries are approximately uniform. Therefore, in first approximation, one may expect that large ruptures of the same fault segment will occur at approximately constant time intervals. If subsequent main shocks have different amounts of slip across the fault, then the recurrence time may vary, and the basic idea of periodic mainshocks must be modified. For great plate boundary ruptures the length and slip often vary by a factor of 2. Along the southern segment of the San Andreas fault the recurrence interval is 145 years with variations of several decades. The smaller the standard deviation of the average recurrence interval, the more specific could be the long term prediction of a future mainshock."
Here I'm going to quote Gopen and Swan directly, because I can't express it any better than this:
"This is the kind of passage that in subtle ways can make readers feel badly about themselves. The individual sentences give the impression of being intelligently fashioned: They are not especially long or convoluted; their vocabulary is appropriately professional but not beyond the ken of educated general readers; and they are free of grammatical and dictional errors. On first reading, however, many of us arrive at the paragraph's end without a clear sense of where we have been or where we are going. When that happens, we tend to berate ourselves for not having paid close enough attention. In reality, the fault lies not with us, but with the author."
The main problem with this passage, they explain, is that the topic positions are constantly occupied by new material, in the place where readers expect to find context and linkage with old material. Because of this, it's difficult to determine exactly what story is being told: is it about the rate at which tectonic plates move? The recurrence time between shocks? The San Andreas fault? The topics jump around so much that one could read the passage through two or three times and not be entirely sure. Instead of finding the topics of the sentences in their expected positions at the beginning, as readers we must search through the passage and pick out the information that seems to be repeated relatively often, which we must then assume to be the topic of the entire paragraph. Many of the sentences seem to provide information about recurrence time, though this never explicitly appears in the topic position of any given sentence. If old material is not found in the topic position, it's also difficult to determine what, exactly, is the new information worthy of emphasis in each sentence. Using the assumption that the paragraph is about the recurrence interval between earthquakes, Gopen and Swan proceed to highlight the most likely candidates for the available topic positions (linking backwards) and stress positions (important material presented for the first time), and they rewrite the passage as follows:
"Large earthquakes along a given fault segment do not occur at random intervals because it takes time to accumulate the strain energy for the rupture. The rates at which tectonic plates move and accumulate strain at their boundaries are roughly uniform. Therefore, nearly constant time intervals (at first approximation) would be expected between large ruptures of the same fault segment. [However?], the recurrence time may vary; the basic idea of periodic mainshocks may need to be modified if subsequent mainshocks have different amounts of slip across the fault. [Indeed?], the length and slip of great plate boundary ruptures often vary by a factor of 2. [For example?], the recurrence intervals along the southern segment of the San Andreas fault is 145 years with variations of several decades. The smaller the standard deviation of the average recurrence interval, the more specific could be the long term prediction of a future mainshock."
Although the paragraph is slightly more readable, the logical gaps between sentences remain. To fill them, Gopen and Swan propose "however", "indeed", and "for example", but only the original authors could tell us whether these connecting words represent their intended meaning. Even in the revised version, it is still unclear exactly what the authors are preparing us for in the rest of their article. Will we learn more about earthquake prediction? Or the difficulties in earthquake prediction? Something else entirely? Although the paragraph is structurally more sound after editing, its deficiencies in communication become much more obvious.
To improve the flow and readability of texts, then, Gopen and Swan provide the following guideline:
"Put in the topic position the old information that links backward; put in the stress position the new information you want the reader to emphasize."
In this way, the writer is consistently providing context for readers, while drawing their attention to the material most deserving of emphasis. By using the topic position to link backwards, sentences flow seamlessly from one to the next, without logical gaps, while the stress positions contain the "take-home message" right where the readers expect to find it.
More to come! Stay tuned....
Gopen, G. and J. Swan. "The Science of Scientific Writing." American Scientist. Nov.-Dec. 1990. Accessed from http://www.americanscientist.org/issues/pub/the-science-of-scientific-writing/.
The primary purpose of scientific documents is simply to communicate our data and our research to the world. Unfortunately, we often forget that successful communication, so crucial to the advancement of public knowledge as well as of our individual careers, depends ultimately on people outside of our control: our readers. Because readers have no way of asking authors real-time questions (at least not yet), the onus is on writers to ensure that readers understand exactly what they are meant to understand.
Luckily, the research of Dr. George Gopen and Dr. Judy Swan has provided us with clear, concrete guidelines to improve the clarity of our writing. Using studies of rhetoric, linguistics, and cognitive psychology, Gopen and Swan were able to identify a set of "reader expectations" most English readers unconsciously possess regarding how information is presented. When writers are aware of these expectations and make proper use of them, the likelihood of readers' understanding increases dramatically. Gopen and Swan's findings and advice to writers are described in detail in their article "The Science of Scientific Writing", which I highly recommend reading. However, for those running a bit short of time, here are some of the highlights.
First, the authors demonstrate the concept of reader expectations neatly and simply using the following charts:
time (min)    temperature (ºC)    vs.    temperature (ºC)    time (min)
0             25                         25                  0
3             27                         27                  3
6             29                         29                  6
9             31                         31                  9
12            32                         32                  12
15            32                         32                  15
The two charts present exactly the same information, but, as English readers, most of us will intuitively prefer the chart on the left; we expect context (in this case, the independent variable) to come first (on the left, since English is read from left to right). Without any additional explanation, most of us would reasonably infer from the first chart that someone took temperature readings every three minutes. I suspect we'd have more difficulty confidently reaching a consensus about the interpretation of the chart on the right.
With this in mind, let's see how they tackle the following example text:
"The smallest of the URF's (URFA6L), a 207-nucleotide (nt) reading frame overlapping out of phase the NH2-terminal portion of the adenosinetriphosphatase (ATPase) subunit 6 gene has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene. The functional significance of the other URF's has been, on the contrary, elusive. Recently, however, immunoprecipitation experiments with antibodies to purified, rotenone-sensitive NADH-ubiquinone oxido-reductase [hereafter referred to as respiratory chain NADH dehydrogenase or complex I] from bovine heart, as well as enzyme fractionation studies, have indicated that six human URF's (that is, URF1, URF2, URF3, URF4, URF4L, and URF5, hereafter referred to as ND1, ND2, ND3, ND4, ND4L, and ND5) encode subunits of complex I. This is a large complex that also contains many subunits synthesized in the cytoplasm."
This passage is certainly difficult to read, but why, exactly? The sentences are long, true, with very technical terminology and abundant use of acronyms. If we remove the terminology and most of the acronyms, though, we still run into difficulty:
"The smallest of the URF's, an [A], has been identified as a [B] subunit 8 gene. The functional significance of the other URF's has been, on the contrary, elusive. Recently, however, [C] experiments, as well as [D] studies, have indicated that six human URF's [1-6] encode subunits of Complex I. This is a large complex that also contains many subunits synthesized in the cytoplasm."
Ask ten readers for the subject of the next sentence, and I'd bet you'd get five saying "URFs" and five with "Complex I". (Just to satisfy your curiosity, here it is: "Support for such functional identification of the URF products has come from the finding that the purified rotenone-sensitive NADH dehydrogenase from Neurospora crassa contains several subunits synthesized within the mitochondria, and from the observation that the stopper mutant of Neurospora crassa, whose mtDNA lacks two genes homologous to URF2 and URF3, has no functional complex I.")
Let's unpack this a bit. According to Gopen and Swan, one reason this passage seems so labyrinthine is that the subjects and verbs of sentences are frequently separated by as many as 27 words. Here, then, is our first reader expectation:
"Readers expect a grammatical subject to be followed immediately by its verb."
Once a reader finds the grammatical subject, his or her mind instantly starts looking for the verb, glossing over all intervening material as being less important. Of course, when you put 27 words between your subject and your verb, it's likely that at least some of those words will be significant to the meaning of the text! To avoid having your readers unconsciously skim over important parts of your text, then, it's best to keep subjects and verbs close together. Gopen and Swan suggest one possible revision:
"The smallest of the URF's is URFA6L, a 207-nucleotide (nt) reading frame overlapping out of phase the NH2-terminal portion of the adenosinetriphosphatase (ATPase) subunit 6 gene; it has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene."
Of course, it's possible that the "interrupting" phrase between the subject and the verb is not all that important, which is exactly what its position between the subject and the verb signals to the reader. In this case, Gopen and Swan advise that it's best to omit this phrase for the sake of clarity:
"The smallest of the URF's (URFA6L) has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene."
Unfortunately, as readers we don't know which of these two revised sentences most accurately reflects the authors' intentions, and we would have to ask them to be sure. The two potential meanings of this sentence lead Gopen and Swan to their next reader expectation:
"Each unit of discourse, no matter what the size, is expected to serve a single function, to make a single point. In the case of a sentence, the point is expected to appear in a specific place reserved for emphasis."
As readers, we unconsciously emphasize the end of a sentence and assume that the most important part of any given sentence will be found there; we build tension as we read in anticipation of an exciting finish. As writers, we can use this tendency to steer readers toward the material that we want them to emphasize. For any given sentence, then, we should strive to put the most important material in the "stress position" at the end. By applying their guidelines about subject/verb separation and stress positions, Gopen and Swan revised the rest of the passage as follows:
"Recently, however, several human URF's have been shown to encode subunits of rotenone-sensitive NADH-ubiquinone oxido-reductase. This is a large complex that also contains many subunits synthesized in the cytoplasm; it will be referred to hereafter as respiratory chain NADH dehydrogenase or complex I. Six subunits of Complex I were shown by enzyme fractionation studies and immunoprecipitation experiments to be encoded by six human URF's (URF1, URF2, URF3, URF4, URF4L, and URF5); these URF's will be referred to subsequently as ND1, ND2, ND3, ND4, ND4L and ND5."
Note that each sentence makes one point, sentences with closely related points are joined by a semicolon, and subjects and verbs appear together. However, Gopen and Swan freely admit that they had to make assumptions about what material from the original passage was worthy of emphasis, so while the text might read more smoothly, it may or may not represent the authors' desired message. There are several other possible interpretations that could work, but that only reinforces their argument: the original text was worded in such a way that many interpretations were likely, and it therefore failed to clearly communicate its meaning.
I'll continue with my summary of Gopen and Swan's article over the next couple of posts, but I highly recommend making the time to read the original in its entirety! More to come....
Gopen, G. and J. Swan. "The Science of Scientific Writing." American Scientist. Nov-Dec. 1990. Accessed from http://www.americanscientist.org/issues/pub/the-science-of-scientific-writing/.
Grammar Guide #5: All about rates
In science, we often report rates: growth rate, death rate, and infection rate, just to name a few. However, the term "rate" is also one of the most frequently misused in scientific documents.
Consider the following examples:
1) In the experimental group, we observed that 65.7% of samples were infected, while the infection rate in the control group was much lower (42.4%).
2) The virus spread more quickly in the control group than in the experimental group, providing evidence that the treatment was effective at slowing the infection rate.
In these two examples, the term "infection rate" is used with two completely different meanings: the first example uses "infection rate" to mean "percentage of samples infected", while the same phrase in the second sentence refers to a change in infection over time.
The problem here is that the word "rate" in everyday English is nearly synonymous with "percentage", giving us sentences such as:
"She has a very high success rate with these kinds of projects." = "She successfully completes a large percentage of these projects."
"The seat belt usage rate is up 3%." = "The percentage of passengers using seat belts increased 3%."
It's not wrong, but it's imprecise. Scientifically speaking, "percentage" means "part of a whole" and "rate" means "change over time". In documents where precision is key, such as academic papers, making your readers guess whether you intend "rate" in its common usage or in its scientific usage is, at best, a bit lazy and, at worst, can lead to a complete misunderstanding of what you're trying to say. Yes, in your particular field the term "infection rate" might be used all over the literature to mean "percentage of samples infected", so much so that it's become accepted, standard practice. I'd still argue that it's a bit sloppy, especially when it's not difficult to write with more precision. Here's the first example rewritten:
"We observed that 65.7% of the samples in the experimental group were infected, compared with 42.4% in the control group."
I know many of you will disagree with me, but small steps towards clearer writing really can make a difference! Try using "percentage", "proportion", or "frequency" (with the last being less ideal as it also implies a connection with time, but a move in the right direction nonetheless) and see if you don't feel just a little bit more confident that you've said exactly what you meant to say.
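If it helps to see the distinction as arithmetic, here is a minimal Python sketch. The function names and all of the counts are hypothetical, invented purely for illustration (the totals were chosen so that the percentage lands on the 65.7% from the example above), and "new infections per day" is just one assumed way of expressing change over time:

```python
def percentage_infected(infected: int, total: int) -> float:
    """Part of a whole: the share of samples infected at a single time point."""
    return 100.0 * infected / total


def infection_rate(infected_start: int, infected_end: int, days: float) -> float:
    """Change over time: new infections per day across the observation window."""
    return (infected_end - infected_start) / days


# Hypothetical counts, made up for this illustration only.
pct = percentage_infected(infected=23, total=35)
rate = infection_rate(infected_start=5, infected_end=23, days=6)

print(f"{pct:.1f}% of samples were infected")   # a percentage: no time involved
print(f"{rate:.1f} new infections per day")     # a rate: always "per unit time"
```

The first number has no time component at all, while the second makes no sense without one, which is why using "rate" for both forces your reader to guess.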
Grammar Guide #4: Less vs. Fewer
The difference between "less" and "fewer" is simple, but the two are often confused in usage. Here's the secret:
"Fewer" means "a smaller number of something", so use it for anything you can attach numbers to.
"Fewer samples were taken from site B than from site A (12 vs.15)."
"In the treatment group, fewer mice exhibited symptoms than in the control group (25 vs. 38; Table 1)."
Use "less" if you're not referring to individuals you can count, but rather to a decrease in a particular quality or concept.
"We had less success in raising viable embryos using this technique."
"We observed less change over time in group A than in group B."
The same rules also apply to "few" vs. "little", e.g.,
"Few of the incubated samples displayed fungal growth." (Few samples of many potential samples)
"We observed little growth in the experimental group." (Growth is the quality being observed, not something to be counted.)
Still confused? Put your questions in the comments!