Ask Doctor Vector: March 2006

Wednesday, March 15, 2006

Larn sumthin' new

Hi. Since you're here, I assume you have a few spare minutes. Conveniently for you, I have a couple of suggestions on how to spend them.

Go here.

Read the essays, "Who can name the biggest number?" and "Return to the beehive".

It will be a better use of your time and energy than whatever you were planning to do here.

Cheers,

Matt

Monday, March 13, 2006

Wafted to Mars on currents of Vicks

They say that the sense of smell is the sense most strongly tied to memory. I'm not sure who "they" are, but I'm willing to bet that they're people who grew up with Vicks Vap-O-Rub. When winter set in, Mom would emerge from her bedroom with a greasy jar of Vicks clutched in one hand, like a crone with a poisoned apple, and hunt us down with what I swear was glee. We'd have to unzip our sleepers to halfway down our torsos so that she could slather ridiculous quantities of menthol mucus over our thin, quaking chests. Why it was a good idea to unzip your warm jammies and have some demonstrably crazy woman smear cold grease on your chest was quite beyond me. Like many aspects of childhood there was no logic and no escape; there was only the suffering.

To keep the Vicks from sliming up our jammies, Mom would set us up with bank-robber style bandanas, tied from real red hankies. You'd go to bed looking like Jesse James and wake up looking like an auto mechanic who had been strangled with his grease rag.

God forbid that you get sick enough to stay home from school, because then you'd have to wear the Vicks rag all day. My sick day activities were about the same as my usual post-bedtime activities--crawl into bed and read science fiction paperbacks, preferably with lurid covers featuring scantily-clad Earth women being menaced by horny tentacled horrors. Only on sick days, I didn't have to bring a flashlight.

A constellation of factors aligned to make 1987 the ultimate winter of my contentment. Having turned twelve in June, I became the sole occupant, scout, lord, and defender of the upstairs bedroom (the only upstairs room, a realm unto itself), which I'd helped Dad make fit for human habitation in the months previous. I'd also gotten a clock radio for my birthday. This gave me access to the local alternative station (KOFM, long may her memory dwell in the firmament of radionic awesomeness) for background music, Casey Kasem's Top 40 Countdown on Saturdays, and--crucially--Dr. Demento on Sunday nights. Dr. Demento came on after my bedtime, so I spent many a late Sunday night in total darkness, sitting sideways up against my headboard, neck turned at a 90 degree angle to place my right ear over the stealthy, shocking, mind-bending, gut-splitting sounds of Dr. Demento's weekly show.

That was also the year that I discovered Edgar Rice Burroughs. He invented Tarzan and wrote loads of pulp adventures in the 1910s, 20s, and 30s. But his first book, probably his most influential, and in my mind his best, is A Princess of Mars, from 1912. I could try to paraphrase it for you, but I'd inevitably end up copying Richard Wolkomir, who eulogized the book in a back page humor bit for the May 1987 Smithsonian. That's how I discovered Barsoom, and that's how you should, too. So here's the whole bit.

-----------------------------------

'Wafted to Mars, he fights four-armed green Tharks'

by Richard Wolkomir (c) 1987 (reprinted with permission of the author).
"I used to take out seven books at a time from the library," S.J. Perelman once confessed, "and sit in the kitchen, with my feet in the oven, eating cookies and reading trash."

Right move, S.J. As a former junk-literature junkie myself, I know you can't kick the habit until you publicly admit you're hooked. Space operas, mysteries, oaters, fan-at-the-bodice romances--whatever has you in its clutches, own up. Recount the plots until you blush for shame and vow never again to abuse your mind with such swill.

My own addiction began at age 8, with Tarzan of the Apes. My mind fizzed with images of chatty primates swinging on lianas. I snorted heavy stuff, too, junk-lit classics like Robinson Crusoe ("Me die when you bid die, master," said Friday), Treasure Island ("Yo-ho-ho, and a bottle of rum!") and King Solomon's Mines ("'Koom! Koom!' he ejaculated"). I OD'd on Sherlockiana, greeting strangers with statements like "beyond the obvious facts that you are a bachelor, a solicitor, a Freemason and an asthmatic, I know nothing whatever about you."

Then I hit bottom: comic books. Memory blurs. I see gangsters tommy-gunning their way to the moral: CRIME DOES NOT PAY. Mesomorphic superheroes walk city streets in tights and capes without anyone pointing and giggling. A pathetic monster-hero rises from a city dump, the sad product of household chemicals and industrial wastes reacting with discarded cold cuts.

On under-the-covers-with-a-flashlight science-fiction expeditions, I met advanced Betelgeusean gnats speaking BBC English. I popped whodunits like amphetamines. But I'm off the stuff now. After joining a Great Books Discussion Group, I'm into Aristotle and Kant. It feels good to wake up clean.

I think back to those lost decades when I mainlined books like Tom Swift and His House on Wheels ("Bless my opera glasses! It's Tom Swift!" said Mr. Damon), while my poor mind orbited Alpha Centauri. I wonder, why?

Elvis Presley, accepting an award from the Jaycees, said that as a kid, he had been a dreamer and had read comic books. In every story, he said, "I was the hero." It's true. Trash-lit can waft you on currents of overwrought prose to lurid realms of derring-do.

In the land of Trash, villains are totally rancid, so that a hero like Tarzan can beat his chest guiltlessly after dispatching a foe, and howl the good howl. Trashland's moral accounting always balances, blackguards rebuked, laurels to the brave. The race is always to the Swift, Tom or otherwise.

Trashland is also soothingly predictable. When trouble bubbles, Clark Kent invariably pops into a phone booth to change into Superman, and he never bangs his elbows pulling off his pants. Why Clark needed phone booths to molt into Superman, or why Superman needed Clark, I never knew or cared. Batman, Spiderman, they all had secret identities, just as all private eyes talked tough and ironic, and good cowboys eschewed black haberdashery--except Hopalong Cassidy, who at least had white hair. And the heroines of Trashland were always stunning and chaste, although skimpy dressers.

It's all come back to me because I found a boxful of old books in our attic. I pulled out A Princess of Mars, the first in Edgar Rice Burroughs' Martian series. Again I was on the Red Planet, locally known as "Barsoom." What a plot! John Carter, an ex-Confederate officer, battles Apaches in Arizona, inhales vapors in a cave and sees a red star. "...it was Mars, the god of war, and for me, the fighting man, it had always held the power of irresistible enchantment," he says.

Wafted to Mars, he fights huge four-armed green "Tharks." But their rifles that shoot "radium" bullets 300 miles are of little threat to Carter, who can jump 60 feet because his "earth muscles" are overwhelming in Mars' weak gravity. Says he, "I have never regretted that cowardice is not optional with me." He becomes a Thark warrior, riding eight-legged "thoats." Meanwhile the Tharks capture Dejah Thoris, Princess of Helium.

Naturally, she is gorgeous. And, like all Burroughs heroines, she haughtily disdains the hero. But he vows to save her, for the Thark idea of fun is torture. Pondering their noisome ways, he gloomily notes, "In one respect at least the Martians are a happy people; they have no lawyers."

Carter fights King Kong-size white apes, wrestles bellicose Tharks, is put in a dungeon. "...cold, sinuous bodies passed over me when I lay down, and in the darkness I occasionally caught glimpses of gleaming fiery eyes, fixed in horrible intentness upon me," he says. He broods over Dejah Thoris, "a woman who was hatched from an egg." But Cupid wins: Carter becomes a Prince of Helium and a doting husband to the oviparous lady. "In a golden incubator upon the roof of our palace," he reverently explains, "lay a snow-white egg."

I'm clean now, ready to plunge into The Mill on the Floss. First, though, for the sake of scholarship, I may just scan The Warlord of Mars, in which John Carter battles his way into the Temple of the Sun, where Dejah Thoris is a captive. It's nip and tuck from the Valley Dor to the dread Carrion Caves, John fighting triumphantly beside his hatched son Carthoris.

P.S. Cancel my appointments. And tell the folks at Great Books that something's come up. It looks like I'll be missing the next meeting.

-----------------------------------------

Putting that article in the path of a bibliophilic 12-year-old was like dropping a match into a powder keg. A Princess of Mars was easily the coolest book I'd ever heard of, and I hadn't even seen a copy yet. But reality intruded: this was rural Oklahoma, in the eighties. There was no Amazon.com, no local Barnes & Noble superstore. I had to wait for the next trip into town, so I could go to Waldenbooks and place an order and wait two weeks for the book to come to the store, at which point I'd have to cajole another ride into town to pick it up. Two weeks! When you're twelve, that might as well be eternity.

I remember that particular eternity quite well. I read and reread that Richard Wolkomir article so many times that I lost track. At one point, I had the entire thing committed to memory, and could rattle it off flawlessly. Even now phrases like "blackguards rebuked, laurels to the brave" or "the race is always to the Swift, Tom or otherwise" flit through my brain like butterflies. When I got to Berkeley and actually got around to reading Aristotle and Kant, I wondered if it should feel good to wake up clean.

Eventually, after the sun had become a cold cinder, the universe had collapsed into a monobloc and a new Big Bang had produced a new universe, complete with a new me--i.e., after the requisite two weeks--my books arrived. I knew that there was a Mars series, and that it included A Princess of Mars and The Warlord of Mars, so I'd ordered them both. They were $1.95 or $2.95 apiece. I believe it was still $1.95, because most sci-fi paperbacks were $2.95 at the time and the really pricey ones were $3.95.

BUT, to my horror, The Warlord of Mars is the third book in the Barsoom series. That vile deceiver, Wolkomir, had neglected to mention Book 2, The Gods of Mars, wherein John Carter must face the blue-skinned cyclops people, with tentacles for arms and mouths for hands, and escape the Holy Therns (I later learned).

Catastrophe! I immediately ordered Book 2, knowing that another cycle of the universe would have to come and go before I could find out what happened between Books 1 and 3. The only question was, would I have the moral fortitude to wait it out? Could I leave Book 3 untouched for two whole weeks while I waited for Book 2 to arrive?

I honestly can't remember if I gave in or not. I think not. The important thing, the thing that ties this all together, is that right after the first books arrived I got sick, and I spent a few happy days up in my lair, swathed in Vicks, listening to Phil Collins's "Another Day in Paradise" and Tears For Fears's "Sowing the Seeds of Love" in regular rotation on KOFM, following John Carter across the ochre plains of Mars.

It all ties together. The wistful, bittersweet strains of "Another Day in Paradise" take me back to dying Barsoom, with its empty cities crumbling on the shores of dry oceans. The smell of menthol takes me back to that little room at the top of the stairs, where I climbed the stairs of Cirith Ungol with Frodo and Sam, plotted the overthrow of the Harkonnens with Stilgar and Paul Atreides, and, most of all, stood by John Carter and Tars Tarkas as they saved the twin cities of Helium from the hordes of Tal Hajus.

I'm sick, it's cold, and the humidifier in London's room is pumping menthol into the atmosphere at an alarming rate. All of the conditions are set. I think it's time to go on vacation.

Friday, March 10, 2006

Generic hostility, Part 8

And more from Randy. My response is at the bottom.

----------------------------------

Subject: Take that, space monkey!

In reply to the two points you posted on your blog:

But I also find genera damn useful in my day-to-day work. Taxonomy is just a model, and models are judged on utility. I'm a nominalist (as opposed to an essentialist)

Ok, first off, why is it more useful to say "the genus Apatosaurus" rather than just "Apatosaurus"? They convey the same information. And the second option keeps ecologists from using taxonomic ranks as ecological units! I can't stand all those papers comparing the ecologies of 'genera'. Can you tell me how 'genera' are useful in your day-to-day use where you can't just say "the clade Apatosaurus" instead? You still haven't answered why you think genera are real. To be a distinct level (rank), they must have some emergent property - what is that?

i.e., monophyletic groups of the same phylogenetic depth that we usually associate with genera

What do you mean "same phylogenetic depth"??? Do you mean branch length, or number of twigs within the clade? And if you mean number of twigs (i.e., diversity), how can you ever be sure the numbers are equivalent? There are distinct twigs even within species, the cryptic diversity everyone loves to talk about. And that's why we should get rid of species too. They don't seem to have any consistent emergent properties either. I just cannot wrap my head around how any of these things can be comparable in a meaningful way, other than if you are comparing sister groups. I feel like people who hang on to ranks argue desperately for them because they are comfortable and familiar, not because they have a sound scientific basis. After all - they are a pre-evolutionary concept! But look at me, I'm starting to sound like a zealot. I think we may have to agree to disagree.

On the other point, the alleged prematurity of touting this method as a genus-catcher when it's so new and so little-tested: I didn't publish it in Nature, or chisel it into the face of a mountain.

Yeah, but you did clutter my inbox with a mass email that taunted phylogenetic taxonomists. And it probably filled up my email account causing the email that said that NSF would fund the "Randall Irmis Center for Triassic Research" to be bounced back to the sender.

Randy

------------------------------------------

I am starting to think that any point I make will get shredded because it will involve a generalization. We're not even arguing on points anymore; you're shredding my parenthetical asides because they're not PC (phylogenetically correct). Nothing that I can say about genera is going to survive; even if I say that they're not real but wouldn't it be neat if they were, you still nail me. In short, we don't have much to talk about.

I think we may have to agree to disagree.

Hmm. Since your position is "Ranks are illusory and anyone who uses them is nuts", the implication of the 'agree to disagree' gambit is that I believe in ranks. My own position is that ranks are not real, but they weren't just applied willy-nilly, and it might be worth (a) considering whether ranks convey any information that we can salvage, and (b) being intrigued by methods that give answers that appear to correlate with what we call ranks (that's two conditionals, please to heaven don't nail me when I'm just trying to talk about what I'm trying to talk about). But even taking that position means that I have to talk about ranks, and I can't do that, even parenthetically, without getting nailed. We're not even ascending to the level of conversation. We can't talk about what I want to talk about because the words one uses to talk about it have been outlawed.

Plus, I'm sure there's some law of Internet discussions that says that the first person to start complaining about how the other person is arguing automatically loses for being a whiner. So, fine, I lose.

Yeah, but you did clutter my inbox with a mass email that taunted phylogenetic taxonomists.

Boy, this is dense. Let's unpack it.

1. So now even mentioning ranks, even hermetically sealed inside multiple barriers of conditionals, constitutes taunting? I suppose that when I referred to Katie as Katie instead of Katherine that was sexual harassment. And now I'm harassing her further, demeaning the cosmic import of her personhood by using her as an example. It's such a violation. Somebody shut me up already.

2. NICE implication that I'm not a phylogenetic taxonomist. Tell me, do you still beat your girlfriend? In any case, I can prove that I am a phylogenetic taxonomist. I just need a new taxon to describe. You wouldn't have any unidentifiable teeth lying around, would you? Oh, snap!

3. Hey, if you don't want your inbox clogged, filter me out. It's invisible and painless. I don't even have to know about it.

Your ball.

Write this way, Part 6

More from Eric Harris.

-------------------------------

First, for clarification:
I think that the separation between what constitutes an idea versus what constitutes data is dependent on scale. The reason I'm skeptical of this distinction is because if we compare broadly across cultures or far through time, the idea/data distinction becomes blurred. BUT, it's a fairly clear distinction within the context of the last 100 years of science, or so. That's why I forged ahead with the distinction in my last email (considering it within a smaller scale).

I'd mainly like to take issue with the implicit point of your argument - that ideas are a homogeneous group of entities. My point about the half life of ideas is NOT that the half life of ideas is shorter - it's that SOME ideas move more quickly than others. I guess what I'd like to get at here is that I don't consider "ideas" to be a homogeneous category. Some ideas are more fundamental than others. Perhaps the more superficial ones move quickly, but the deep seated ideas may change much more slowly. Maybe your suggestion of electronic and quick publication would work for the ideas that are less fundamental ("solutions to the problem", like I mentioned last time) but an idea that really changes the landscape of a field (the "problem" itself) may need a longer gestation period. Ideas can be immature. Especially if they contradict some fundamental assumptions that people have. Blurting out an idea in an electronic journal may not be appropriate for ideas that could fundamentally change the way people think about the world. In these cases, a steady accumulation of data to support your theories and taking the time to work with your idea are probably more important than getting it on the internet. Like I mentioned previously, I believe that an idea isn't a substance that's in your head, but a combination of an insight, research, organizational work, cultural knowledge, and interaction. I imagine it's like writing a book. It's not that some people have the book in their head and then type it out. The idea develops as you write it and research it. And develops further as you stand by it and defend it in the face of opposition. But the degree to which all of these elements in the development of the idea are in play will vary depending on the "size" of the insight. Some - perhaps more suited for your method of electronic publication - will be short insights that solve a specific problem - but others need a different form of birth. 
You mention that ideas are r-selected, and data are K-selected.

Well, there's undoubtedly the mice and elephants in the safari of ideas too.

{Which gets to a question: can ideas be different sizes? and what does that mean, exactly?}

Lastly - I've been thinking about whether ideas really do change faster than data. The more I think about it, the more I think that data can die as quickly as (sometimes more quickly than) ideas. I was just flipping through a recent issue of Systematic Biology - the oldest reference I noticed was an old taxonomic monograph from 1854 (point 1: data lasts longer). But in another paper, the oldest reference was a paper on the theory of chromosomal evolution from 1971 (point 1: ideas last longer). When I think of my own research in phylogenetics, morphological data has largely been replaced by molecular data, but the method (theory) of phylogenetics has stuck around (point 2: ideas last longer). And some forms of data have been completely replaced (chemotaxonomy anyone?) (point 3: ideas). But the methods for doing specific problems in phylogenetics are constantly changing (point 2: data). I checked the 'journal half-life' metric on the journal citation index of Web of Science: Biology & Philosophy (ideas oriented) = 7 years, Systematic Botany (~data) = 7.5 years. This is only one comparison, and a consideration of my specific situation. But I definitely do not think that it is clear and evident that ideas move more quickly than data.

Thursday, March 09, 2006

Generic hostility: state of play (Part 7)

A few last thoughts (for now) before I sack out:

As usual when things get talked out, we're converging in some areas and sticking to our guns in others. I think the main battlegrounds now are these: (1) as an ardent rank-free phylogenetic taxonomist, Randy objects to me talking about genera as ranks, and (2) he objects to me talking about this method being used to identify genera when it's been tested on so few things.

If I've left anything out or mischaracterized his views, I'm sure he'll let me know. :-) Watch this space.

To which I respond:

(1) Look, bitches, I learned my phylogenetics at Brent Mishler's knee. It's not that I don't get the arguments for rank-free taxonomy. I love 'em, in fact. But I also find genera damn useful in my day-to-day work. Taxonomy is just a model, and models are judged on utility. I'm a nominalist (as opposed to an essentialist), so I'm happy to use genera as useful mental constructs without believing that they Really Exist As Ranks In The Tree Of Life. That's the main reason I was (and am) so excited by the MALDI-MS stuff--IF there was a method that picked out genera (i.e., monophyletic groups of the same phylogenetic depth that we usually associate with genera) based on objective criteria, that would be something; it would be a little bit of evidence that maybe genera (i.e., ifyouknowwhatImeanandIthinkyoudo) are a little bit real, that there is something special about that particular phylogenetic depth, that we're not just whistling Dixie. Please don't let the length of that sentence disguise the fact that it's dangling from a very big IF.

(2) On the other point, the alleged prematurity of touting this method as a genus-catcher when it's so new and so little-tested: I didn't publish it in Nature, or chisel it into the face of a mountain. I didn't even have it here on the blog to begin with. It was in a damn e-mail, and if a man can't let his enthusiasms run away with his good sense in e-mail, then we are living in a cold, dead world indeed.

Also, another labmate, Katie Brakora, sent this in response to my flame attack, and has graciously given me permission to post it.

-----------------------------------

Subject: Re:

Eggggscellent. Standing ovation, from the far corner.

Seems this new method's main (and significant) contribution (if only to vertebrates, which are implicitly important to, approximately, 6.5 billion people on earth - or did we hit 7 billion the other day?) is adding (a small corner of) the molecular dimension to those "entities" of the distant past which are only known from morphological characters, and, if you're lucky, histological ones. As molecular characters, they are subject to the same problems and considerations as those from the living, including problems of evolutionary rates, long branch attraction, etc, ad nauseam. So the fuck what. We can work with this.

As for genera, if we define them monophyletically (a reasonable goal, in my book, when combined with ecological labels as an alternative method of grouping "stuff"), then presumably we're smart enough not to rely on a single criterion for membership. Context, people. It's another tool in the bag, weapon in an increasingly fine arsenal. Only time will tell if it's as useful as the 1911 Colt .45. It's our job to find out.

Yes, shoot me for my use of parentheses. They are among the finest of non-word literary inventions, second perhaps only to the even more emphatic (-) thingy.

Katie

---------------------------------

To which I replied:

---------------------------------

Hi Katie,

I like your "just another tool in the kit" stance. All I intended to convey was, "Hey, this is cool," but obviously what came across (to some) was "Hey, stop yer grinnin' and drop yer linen, I've got the universal panacea bottled and ready for sale over here!"

And no need to apologize for parentheses, semicolons, em-dashes, ellipses, italics, ALL CAPS, excessive quotations, and especially long lists set off by commas; all of these are useful--nay, indispensable--tools of the adroit writer. Indeed, if I had my way every one of my sentences would consist of two clauses joined by a semicolon; that's just how I roll.

Cheers,

Matt

------------------------------

Finally, really, whether the winds of fate brought you here right after this was posted or six months hence, if you have something to contribute, please do e-mail me or post a comment. I will get back to you.

Generic hostility, Part 6

Randy's turn. The Alroy paper he's talking about is this one:

Alroy, J. 2003. Taxonomic inflation and body mass distributions in North American fossil mammals. Journal of Mammalogy 84(2):431-443.

And obviously he did relent and give me permission to post all of this.

--------------------------------------

Subject: RE: Piss and piss

Fair enough. Do you mind if I post it all on the blog? :-) Seriously.

Actually, I'd prefer if you don't. If only because my first email and your first response were rather vicious. I might be a raging lunatic, but I don't want it clear to see on the internet! You could paraphrase our arguments though.

Okay, so maybe that was a bit sweeping. But wouldn't it be interesting if there was even a single category of data that even for some vertebrates picked out what we normally label genera? I went a bit overboard last time.

But I don't see how you escape the circularity of saying it goes to the resolution of genus, and then using it to separate genera! As I said before, it is self-fulfilling. This is conflating definition and diagnosis (in the broadest sense, not necessarily cladistically). You can't define genera on osteocalcin sequences and then diagnose all genera on the same data - there needs to be separation. Plus, even the best character data method fucks up sometimes. That is why we use monophyly (natural grouping) to define our groups, and then we could use osteocalcin to diagnose the genera.

I don't think genera are based on nothing.

I'm not sure what you mean here. I agree that genera are built on phylogenetic data (even pre-cladistically). And certainly, phylogenetic systematists who use Linnean ranks make sure that their taxa (inc. genera) are monophyletic. However, I don't think that genera or species are anything real in terms of being a special level with emergent properties of all their own. For example, you can't say, oh, there is clearly a level of selection at the genus level that is unique to the genus level (it is just normal clade selection). So, why should we single out the clades as "genera"? Just call them clades, cause they ain't special!

At least with fossil vertebrates, I think genera are more solid than species. People have argued for decades about how many species of Apatosaurus and Diplodocus there are, but no-one argues about whether a particular specimen belongs to Apatosaurus or Diplodocus. For what it's worth, fossil genera seem to correlate with recognizable chunks of morphospace, better than species anyway. I've talked to some mammalian paleontologists about this and gotten a similar response.

I think this is mostly a result of the distribution of characters among taxa. Genera these days are more often diagnosed with good hard autapomorphies, whereas species generally are diagnosed using unique combinations of character-states (without autapomorphies). This means that at the species level, it is easier to argue that your characters are a result of individual variation, etc. And this is where the lumpers and splitters continually fight.

I don't think the trend you see in sauropods is true for even all dinosaurs. It would be interesting; we should ask Mike Benton or Matt Carrano (who have large taxic databases) if they see more turnover in genera or species through paleontologist time. I've included an interesting Alroy paper on the subject.

Sorry, I'm not on board. I don't see why a method that is useful for identifying monophyletic groups in one clade shouldn't be used just because it doesn't pick them up in all clades.

That's because you're misunderstanding my argument! As I say later in my previous email, it's the difference between taxonomy & phylogeny. The method is VERY useful for understanding phylogeny, but not for defining your taxonomy (the act of ranking things, etc.).

But I don't think genera are going away, at least not anytime soon, and I'd rather have objectively delineated genera, even in one little corner of the tree of life, than nothing.

Maybe, maybe not. But I don't think genera going away is a problem. People are already revising alpha taxonomy with the idea of making groups monophyletic. The only difference in the future is that we just won't call them "genera", they'll just be a low-level clade. And that's perfectly fine. The names won't change. So I don't understand why people are so reluctant to jettison the little prefix "genus". Of course, for the sake of nomenclatural validity, none of us will be doing this until the PhyloCode comes into effect. Nevertheless, in a paper I have in press, I phylogenetically define and diagnose a genus - the reviewers had no problem with that.

If you disagree with any link of the chain, you're not going to like what I'm saying.

I disagree with (1) definitely in the sense that these genera are not "special" or consistent across clades (even within say Dinosauria). With (2), I definitely recognize that there is a patchy distribution across morphospace, but I don't think genera encapsulate it successfully. Only a phylogenetic taxonomy that is rankless does a good job. Why? Because the amount of empty morphospace between two sister genera is different, not consistent. And this is simply a reflection of what morphospace is occupied. Also, I never understand why people are so fast to haul out constraint when they talk about patchy morphospace. There definitely is constraint and modularity and all that jazz, but I think a lot of empty morphospace is mainly due to differential extinction!

And osteocalcin might not work as advertised, or maybe it only appears to work because it ticks over every 2 million years and just by chance all of the genera on which it's been tested so far are more than 2 million years old. There are a jillion things that could derail this. But IF it works, it will be interesting, and useful, and cool.

Let me reiterate. I'm not criticizing osteocalcin's potential for being a new character data set. I'm only skeptical of claims that it can "pick out" genera. I think it has wondrous potential for improving our phylogenetic view of extinct taxa.

Is there some better way to say it?

Not make the claim until you have better evidence? Sorry, I'm just jaded and cynical, and I want proof in the pudding before someone makes a statement about it.

P.S., yeah, they sequence the peptides.

Regards,
Randy

Wednesday, March 08, 2006

Generic hostility, Part 5

My turn again.

---------------------------------

Subject: RE: Piss and piss

BTW, I'm not copying this to everyone because I don't want to clutter their inbox.

Fair enough. Do you mind if I post it all on the blog? :-) Seriously.

Also, thanks for taking my rant with the large grain of salt that it called for.

Rightly or wrongly, your email came off as a statement about taxonomy, not phylogeny. I was objecting to the statement that "we might finally have an objective basis for recognizing genera". If you are going to make this claim, then I think you need to address the body of systematic literature that has argued over this for the past 50-100 years as well as have good evidence that the method reveals some emergent property at a particular level.

Okay, so maybe that was a bit sweeping. But wouldn't it be interesting if there was even a single category of data that even for some vertebrates picked out what we normally label genera? I went a bit overboard last time. I don't think genera are based on nothing. At least with fossil vertebrates, I think genera are more solid than species. People have argued for decades about how many species of Apatosaurus and Diplodocus there are, but no-one argues about whether a particular specimen belongs to Apatosaurus or Diplodocus. For what it's worth, fossil genera seem to correlate with recognizable chunks of morphospace, better than species anyway. I've talked to some mammalian paleontologists about this and gotten a similar response.

My personal opinion (and one reflected in the phylogenetic nomenclature literature) is that nomenclature and taxonomy should reflect relationship (no problem with the new method), and that it should be consistent through the tree of life, because life is monophyletic, so the same rules apply (here we have a problem).

Sorry, I'm not on board. I don't see why a method that is useful for identifying monophyletic groups in one clade shouldn't be used just because it doesn't pick them up in all clades. mtDNA is pretty good at picking out low-level vertebrate clades, but eubacteria don't have mitochondria. That doesn't mean that we can't use mtDNA where it's useful. Similarly, if these proteins give us monophyletic supraspecific groups in bony vertebrates, I don't see why we shouldn't use them. And if the things they pick out correspond to our usual notions of genera, I don't see why we can't call them that. I know, I know, we should all renounce ranks of all kinds. Believe me, I know those arguments forward and backward. But I don't think genera are going away, at least not anytime soon, and I'd rather have objectively delineated genera, even in one little corner of the tree of life, than nothing.

When doing taxonomy (not phylogeny), I want something that has continuity across clades. Why should I pick osteocalcin markers as my "genus" separator over another character? Why don't I just pick the 1028 base pair of the cytB gene, or something else? As long as it gives me the resolution I want, I can pick anything; hence why genera will always be to some extent subjective.

Ha. Now you're onto something. Yeah, we could choose just about anything to delineate genera, if we thought that was a worthwhile exercise. Here's my position:

(1) Many workers think that there is "something to" osteologically-identified genera of vertebrates. They seem to cohere better than either species or suprageneric taxa.
(2) What if we're not just whistling Dixie; what if there is a sound genetic and/or developmental basis to the chunks of morphospace we designate as genera?
(3) And what if osteocalcin picks out those chunks?

If you disagree with any link of the chain, you're not going to like what I'm saying. I can live with that. It is just a chain of not-rigorously-supported inferences and what ifs. But I am excited about the potential. And sure, we might be kidding ourselves about genera. For half a century, people peered through telescopes and drew detailed maps of nonexistent Martian canals, too. And osteocalcin might not work as advertised, or maybe it only appears to work because it ticks over every 2 million years and just by chance all of the genera on which it's been tested so far are more than 2 million years old. There are a jillion things that could derail this. But IF it works, it will be interesting, and useful, and cool.

Then how can one claim that it is genus-specific if not very many things have been tested!

Grrr. Seriously, all that anyone has said is that SO FAR it APPEARS to pick out generic differences in things THAT HAVE BEEN TESTED. That's at least three conditionals. What more do you want? Is there some better way to say it?

OK, and now for a constructive question: What exactly is the method looking at? Are they looking at the sequence of amino acids, or the folding of the protein? I would expect the former to be easier to get at than the latter in the fossil record.

Beats me. Darren forgot to include the relevant papers in his bibliography, and I haven't bothered to look them up yet. But Google is just a click away...here we go.

Nielsen-Marsh et al. 2002. Sequence preservation of osteocalcin protein and mitochondrial DNA in bison bones older than 55 ka. Geology 30(12):1099-1102.

Nielsen-Marsh et al. 2005. Osteocalcin protein sequences of Neanderthals and modern primates. PNAS 102(12):4409-4413.

Okay, those are just the refs. I will attach the papers, and read them myself, and get back to you on how this works. And you get back to me with a further smackdown, and I will exercise more restraint in my future responses.

All the best,

Matt

Generic hostility, Part 4

Both of the previous messages were sent around to everyone who had gotten the first one, and some observers got the impression that Randy and I had our hate on. Actually, as this response by Randy shows, we were just engaged in a frank discussion of ideas.

----------------------------------

Subject: RE: Piss & Vinegar

BTW, I'm not copying this to everyone because I don't want to clutter their inbox.

I wasn't attacking the validity of the method to provide new, important phylogenetic data at high resolution. To this extent, I think it is really exciting and has great potential. Rightly or wrongly, your email came off as a statement about taxonomy, not phylogeny. I was objecting to the statement that "we might finally have an objective basis for recognizing genera". If you are going to make this claim, then I think you need to address the body of systematic literature that has argued over this for the past 50-100 years as well as have good evidence that the method reveals some emergent property at a particular level.

My personal opinion (and one reflected in the phylogenetic nomenclature literature) is that nomenclature and taxonomy should reflect relationship (no problem with the new method), and that it should be consistent through the tree of life, because life is monophyletic, so the same rules apply (here we have a problem).

But if it works for vertebrates, if it gives us something beyond morphology to help figure out how fossil taxa are related, then I say it's a good thing. How could it possibly be any worse than our current ideas of what constitutes a genus, which are based on..uh..er..ub..that's right, NOTHING.

Well, I agree with your first sentence (phylogeny). But regarding the second sentence, why should we be looking for what constitutes a "genus" if there's no evidence that such a rank actually exists in the hierarchy of life (taxonomy)! Actually, I would say traditionally genera are based on some criteria, but these criteria are unique to each clade (and the workers working on that clade). Using this method to evaluate vertebrates is no different, it is unique to Vertebrata. When doing taxonomy (not phylogeny), I want something that has continuity across clades. Why should I pick osteocalcin markers as my "genus" separator over another character? Why don't I just pick the 1028 base pair of the cytB gene, or something else? As long as it gives me the resolution I want, I can pick anything; hence why genera will always be to some extent subjective.

No, it hasn't been tested in every genus of everything that ever lived. The method was just invented last year.

Then how can one claim that it is genus-specific if not very many things have been tested! Again, I'm not denying its value as a phylogenetic tool and possible major source of info in the fossil record.

OK, and now for a constructive question: What exactly is the method looking at? Are they looking at the sequence of amino acids, or the folding of the protein? I would expect the former to be easier to get at than the latter in the fossil record.

Regards,
Randy

Generic hostility, Part 3

...which prompted me to fire this off (verb chosen deliberately).

In my unholy wrath, I forgot to include a subject line, and I was mistaken about MALDI-MS; as the first Nielsen-Marsh et al. paper was published in 2002, it was definitely not invented last year. Neither of those errors affects my arguments.

------------------------------------------

Subject:

[blasphemy deleted]

What "grand claims" are you referring to? I said _appear_ to be genus specific. No, it hasn't been tested in every genus of everything that ever lived. The method was just invented last year. And in case you missed it, the blog post that brought it to my attention (Darren's) was all about how our current taxonomies probably need to be overhauled anyway.

And don't even get me started on the "it doesn't apply to everything that's ever lived, so it's worthless" argument that you vomit up at the end of your message. As if it weren't implicit in everything that everyone ever says about anything, YOUR MILEAGE MAY VARY. If it doesn't apply to moss, then don't fucking apply it to moss. But if it works for vertebrates, if it gives us something beyond morphology to help figure out how fossil taxa are related, then I say it's a good thing. How could it possibly be any worse than our current ideas of what constitutes a genus, which are based on..uh..er..ub..that's right, NOTHING. So even if your doomsday scenario comes to pass, wherein all vertebrate zoologists start revising genera to match the results of the MALDI-MS, then at least we vertebrate zoologists will have a genus concept that's based on objective reality.* Yes, everyone else will still be fucked. Exactly as fucked as we are right now.

* Which is not what I was even advocating. I'm just pumped that we have something besides morphology that might possibly indicate relationships in fossils, since morphology hasn't exactly done a great job of sorting out extant species OR genera, as we're learning daily.

I guess I could have anticipated all of your objections in advance, and instead of writing

Certain bone proteins appear to be genus-specific.

I could have written

Certain bone proteins, which are only present in vertebrates with bones and therefore hardly worth looking into, appear to correlate with the imaginary entities called genera that we've been using for 250 years with no objective basis whatsoever, in the handful of taxa that have been tested in the handful of months since the method was invented. Please don't get the least bit excited about this because it hasn't been vetted for ~30-100 million extant species, let alone the billions of extinct species, and even if it had been, it still wouldn't apply to non-vertebrates, and genera are imaginary anyway, so describing anything as "genus-specific" is just a fancy way of airing your abject stupidity.

But I assumed that I could deliver roughly the same information more succinctly without having some pedant point out how the sweeping statements that I didn't make weren't actually universally true.

Piss off,

Matt

Generic hostility, Part 2

Randy Irmis sent this in response. He was afraid that it didn't show him at his best; I assured him that if anyone comes off looking like a loony or an asshole in this exchange, it's definitely me (see next post).

-----------------------------------------

Subject: Re: Cryptic species and real genera; also, lasers

Certain bone proteins appear to be genus-specific.

Before you gun me down with your quick-draw "What's a genus?" Wake&Mishler revolver, just hang on a sec.

Ooops, your second is over! How many "genera" have been tested? If you state that it differentiates genera, isn't this just a self-fulfilling prophecy, with people revising the contents of genera because they don't fit the method that supposedly differentiates genera? Does it actually differentiate a speciose genus (e.g. 100 species) just as well as a monospecific genus? These are things that need to be tested before making such grand claims.

Nevertheless, that is not the biggest problem with this method. The vast diversity of life is outside of Vertebrata. The method does not have any relevance here, and one of the biggest arguments about the artificiality of ranks is that they do not apply equally across the tree of life. This problem is not solved by this method.

And yes, I'm cranky today.

Never at a loss for words,
Randy

Generic hostility, Part 1

The other day I sent this around to the usual targets at Berkeley.

-----------------------------------------------

Subject: Cryptic species and real genera; also, lasers

Hi all,

I promised to send this to Brian and Alan, but then I realized that many more of you might be interested. Here's the news:

Certain bone proteins appear to be genus-specific.

Before you gun me down with your quick-draw "What's a genus?" Wake&Mishler revolver, just hang on a sec. If these proteins behave as advertised, we might finally have an objective basis for recognizing genera. AND we might be able to identify otherwise meaningless chunks of bone, which could be of some use to neontologists, and would definitely be of great use to paleontologists. Read all about it here.

Also, I'd be remiss if I didn't point out that there have been several new additions to the "Write this way" thread on my blog, in the form of a fourth entry, by one Randy Irmis, and a long comment on same, by one Darren Naish. Yes, my intellectual pool is highly inbred--but it sure is active! The floor is now open for jokes attributing the same qualities to my gene pool.

Cheers,

Matt

P.S. The bone proteins mentioned above are sequenced with matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS). When my subject line promises lasers, I deliver.

Tuesday, March 07, 2006

Write this way, Part 5

I started out with the assertions that (1) there are idea papers and data papers, and (2) fewer good ideas would rot in our private intellectual vaults if people had some incentive to part with their ideas (i.e., credit).

Happily, both assertions have come under fire--happily because I wouldn't have had all these interesting thoughts if they hadn't.

Eric and Randy have written to protest the artificial division between ideas and data--but I notice that both of them went on to talk about ideas and idea papers as if the division is real. :-) Any taxonomy is going to simplify and distort its subjects. That's life. But we accept such simplifications and distortions if the categories are useful. Sure, ideas and data are inextricably linked. But there seems to be general agreement that there are papers that are idea-heavy and data-light, and other papers that are idea-light and data-heavy.

I think where this has all gone wrong is with the assumption that I'm making a value judgement when I say that idea papers and data papers should be treated differently. We're used to living in a system where fast-track journals are more prestigious, so it's natural to assume that anything that is fast-tracked must be favored.

All I started with was an observation, with no relative merit implied: ideas die faster than data. Some ideas die while they're waiting for data. Maybe we should let the ideas out of their cages and see how far they can fly in their pathetic little mayfly lives.

One possible outcome would be that we find out that ideas mostly suck. I'm open to that. In fact, if ideas are flying around freely and data is still grinding out the old-fashioned way, it's probably inevitable that ideas will be valued less than data. I'm fine with that, too. For one thing, if the average credit for ideas goes down, then the slackers will have less incentive to deluge us with their empty thoughts. Also, if the credit for ideas goes down in value, the credit for data will rise. That might actually help counter the trend that Randy is lamenting--that it's getting hard to publish data without one or two sexy ideas slapped on the introduction and conclusion like cheap paint. Isn't that how buzzwords get buzzy?

The more I think about it, the more it seems inevitable that the system I'm proposing would lead to ideas being devalued and data being, er, up-valued. I honestly don't see why I can't have my cake and eat it, too. Undoubtedly if my system was implemented there would be an initial glut of ideas--the flood of crap that Randy and Darren are worried about (and rightly so). But that very flood of crap is going to do two things: drive down the average value of a free-floating idea, which will remove the incentive to publish crap ideas; and rapidly exceed everyone's tolerance for bad ideas. In the aftermath of the initial crap-tsunami, I think we'd have a world in which:
(1) the reward for publishing crap ideas would be nil;
(2) everyone would have a better-tuned crap detector, from having to sort through so much of it, which
(3) would help good ideas get noticed, and
(4) would also emphasize the value of good data and good observations. It could even help
(5) reverse the decline in funding and publishing opportunities for basic natural history.

But all of those things are epiphenomena; all I was setting out to do was suggest a system in which
(6) good ideas would be more likely to bump into someone who could use them while they were still fresh.

Maybe this rosy little future wouldn't come to pass after all. Maybe the crap tsunami would go on for so long that people would get sick of it, and go back to a system like the one we've got now. It seems to me that the worst that could happen is that we end up back where we started. But the best that could happen would be a world where good ideas found people to work on them, and the work it takes to generate data was rewarded proportionately.

There, a nice little optimistic house of cards. Go on, knock it down. You know you want to.

P.S. In a comment on WTW3, Mike Taylor called me out. In short, he dared me to send a recently-completed manuscript to arXiv. I thought of about a dozen reasons why I could puss out, but (surprisingly enough) I'm going to look into it. Stay tuned.

Sunday, March 05, 2006

Write this way, Part 4

From Randy Irmis, one of my labmates.

----------------------------------------

A couple of points regarding your and Eric's comments (feel free to post this on your blog):

- I would disagree that data & ideas are opposite ends of a spectrum. In fact, they are inextricably linked. Even the most idea-poor/data-rich papers are profoundly influenced by the ideas of the author who gathered the data. Think about it - how do we go about gathering and analyzing data; how do we set up our materials and methods? It's based on prior ideas of what we think is going on. Darwin realized this a long time ago when he said (in a different context):

"How odd it is that anyone should not see that all observation must be for or against some view if it is to be of any service."
- Charles Darwin in a letter to Fawcett (1861)

- I'm not so sure that the good ideas eventually outshine the bad, especially if 95% are bad (note that good ideas don't necessarily have to be correct - see below). And this has been true for a long time. Take for example Mendel. He figured out the laws of inheritance, but there was so much crap floating around that no one paid attention to him for 50 years (people in England/France did have his reprints after all). There are way more scientists and journals now than ever, so there is even more crap, and conversely more hidden good ideas, that we have to contend with.

- Ideas are nothing without data. Let's look at continental drift. Wegener was ultimately correct that the continents do move around the earth (continental drift), but he didn't have the data to support it, and his mechanism for continental drift thus fell flat on its face. For this reason, people did not accept drift, and it took 40 years, until plate tectonics, a mechanism supported by excellent data, was proposed, for people to accept continental drift. In the pantheon of science, I'd much rather have an incorrect idea supported by solid data than a correct idea supported by little or no data. Again, another great Darwin quote:

"False facts are highly injurious to the progress of science, for they often endure long; but false views, if supported by some evidence, do little harm, for everyone takes a salutary pleasure in proving their falseness and when this is done, one path towards error is closed and the road to truth is often at the same time opened."
- Charles Darwin

- Finally, I'm not so sure that it's easier to publish data-rich papers. I actually feel as though in paleontology at least, it is much easier to publish idea papers because editors are always looking for contributions which will be high-impact and will be of broad interest to the journal's readership. That's why it's getting harder and harder to publish simple descriptions, new occurrences, etc., because editors view this as low interest, whether or not the contribution is competent.

P.S., I'm in a quoting mood this morning, so I'll end with another great quote that always humbles me:

"We're not working in a vacuum where we suddenly get plopped on this planet and say, 'Nobody has thought about this before.' You can be sure that almost any idea you have, people have thought about it before. Maybe they didn't write about it, maybe they didn't pursue it. It's very humbling, because in a sense there's nothing really to invent. There are only things to be perceived and interpreted. It's a question of awareness and saying, 'Am I getting all the messages there? Am I putting all these pieces together in the proper way?' If you're not, you're not making progress."
- Bernard Chouet

Regards,
Randy

Friday, March 03, 2006

Write this way, Part 3

My reply to Eric. He's already promised a response. And please don't feel that you have to just spectate here--feel free to send in your own thoughts, either as e-mails to me or as comments on the blog posts.

----------------------------------------

Hi Eric,

Thanks for the response.

First things first. Do you mind if I post all this on my blog? I think it would be cool to have it all out in the open for people to follow. Especially since we're not trying to beat each other down, and because (I think) we have substantial points we agree on, as well as substantial differences.

Re your point about data and theory being intertwingled. You're right, it's not a dichotomy. It's probably a spectrum, with purely conceptual stuff at one end (although there will always be some "contamination" from empirically-established facts) and purely empirical stuff at the other (although almost always in the service of one idea or another). I think that it is easier to get more empirical stuff published. In fact, it may be quite hard to get an idea out to a broad audience unless it is piggybacked on some supporting data. That's good to the extent that it filters out some of the loonier ideas, but it's bad because it slows down the dissemination of ideas in general. I argue that the bad outweighs the good; we're already pretty good at screening out the bad ideas on our own. If I could be exposed to more ideas, faster, for the price of having to screen out more nonsense, I'd do it.

Cornell's Paul Ginsparg made a similar comment about peer review in the age of arXiv: "The role of refereeing may be over-applied at present, insofar as it puts all submissions above the minimal criterion through the same uniform filter. The observed behavior of expert readers indicates that they don't value that extra level of filtering above their preference for instant availability of material 'of refereeable quality.'"

You also make a good point about half-lives. I have John Bell Hatcher's monographs on Diplodocus and Haplocanthosaurus on the shelf, and I use them a lot. They were published in 1901 and 1903, respectively. The concepts surrounding dinosaur biology have changed tremendously in the intervening century, but the data--the observations--are just as useful as they ever were. Data are immortal, assuming that they don't get lost by the whole field, or superseded. Concepts have a faster burn time. I think that in the public sphere there is a half-joking, half-serious perception of scientists changing their theories every five minutes, every time new data comes in. That's probably too optimistic--I wish we could be perceived that well.

(It's funny to come to this so naturally in the course of our discussion, because this epigram has been rattling around in my head all week:
"Critics often make fun of scientists for changing our theories so often. Let's get something straight: the ability to change our ideas in the face of new knowledge is not a weakness of science. It is not even a strength. It is science."
Matt Wedel, 2006, thank you very much.)

What I'm arguing is that because ideas have shorter half-lives, it's stupid to make them wait around for data. Let's get them out there as fast as possible. The more ideas we have in the public pool, the richer we all are. Even if 95% of them are junk. Even if a lot of them don't have any data to support them yet. If the ideas are any good, they'll attract data. Heck, if they're particularly bad they may attract data, just so people can bury them.

it's good to write ideas quickly because it's easy to forget them. I agree with you in principle here. But the reason for my agreement is NOT because we need to have some record for posterity so that philatelic historians can sift through the records to attribute the original idea to the author.

Hell no! I'm in perfect agreement with you here. Giving credit for ideas is not about hoarding credit (although some people would no doubt treat it that way).

Ideas are r-selected. Data are K-selected. But right now we're forcing ideas to propagate as slowly as data. I want ideas to travel faster. The problem is, how do you get someone to part with an idea? There has to be some kind of reward. Under the current system, you usually only get rewarded for an idea if you also gather up the data to support it. That's time-consuming, and wasteful, because no one can gather enough data to test all of their ideas, and because the ideas are just waiting around in the meantime. Let's give just enough credit to get people to part with their ideas while they're still fresh, so someone else can get on with the testing--or so someone else can take the idea in an unexpected direction.

One consequence of this, which I forgot to point out in my original post, is that we may find out that ideas really are cheap. If we instituted a system like the one I'm proposing, maybe everybody would fire off all of their good ideas in the first few weeks, and then we'd all realize that the good ideas were occurring to everyone at the same time, or that there just weren't that many good ideas out there, period. In which case the value of "creativity credit" would go down, and people would have less incentive to publish ideas without data. We could very easily evolve back to a system like the one we use now.

Although I admit that that's possible, I don't think it's likely. Almost every advance in my own research has come from an utterly serendipitous contact with a new idea. Maybe I finally get around to reading that paper that I photocopied two years ago, or a conversation with a stranger makes me see my research from a new angle, or I see a talk at a meeting and realize that I could solve that guy's intractable problem with the data I already have in hand. I suspect that everyone else's research progresses the same way. We're like molecules bouncing around in idea space, but right now the number of collisions is being kept artificially low, because we're forcing all of the particles to travel at the speed of the slowest ones (i.e., data papers).

I'm just proposing that we turn up the heat.

Write back--let's keep this going.

--------------------------------------------

Postscript

I realized too late that I didn't reference Eric's thoughts on how we 'perform' ideas, but I think he's onto something. You probably know of one or two good ideas in your field that no-one paid any attention to, either for a long time or forever, not because they were bad ideas but because the author(s) communicated them poorly. This bears further examination.

Write this way, Part 2

Hi all,

Eric Harris sent this thoughtful response to my piece on scientific publication (two entries down), and has given me permission to post it here. My reply comes next. Eric's promised a response to that, too, so stay tuned.

----------------------------

Hi Matt,

I couldn't resist replying - this stuff is fun to think about. This semester I've been trying to be a devil's advocate by trying to disagree as much as possible, until I can't think of any counter-arguments. If the responses to my disagreements are strong enough, I reject my criticisms. That's what I've tried to do here...(for the sake of fun, of course).

My understanding of your argument:

1) scientific publications can be separated into two groups: 'ideas' vs. 'data'. Both are published in the same kind of journals.
2) the realm of ideas changes more quickly than the realm of data.
3) we need to publish 'ideas' papers more quickly to have a record of proper authorship and to keep current.

1) I think that 'ideas' and 'data' are more closely linked than you acknowledge. How can you ever get a piece of data uncloaked by the web of techniques, theories, instruments that were used to make the data? If you look within the horizon of surrounding decades it may appear that there are data that can pass through the years, unattached to ideologies that supported them. But try centuries - does a collection of plant information in an herbal written in the 1600s have data that appears free of ideology? Additionally, I'd argue that most 'conceptual' papers (in biology) are rooted in empiricism and supported by data - is Darwin's Origin of Species solely conceptual? Ultimately though, I guess a review paper is different from a primary source (even BIOSIS knows that). I just feel that the further we look through history (or across cultures), the more this distinction starts to blur.

2) Even if 'data' and 'ideas' are separate - why does it seem that the half-life of an idea is much shorter than data? I like thinking of the dichotomy between "the problem" and "solutions to the problem". Take your example - bird teeth. The problem: why did birds evolve to not have teeth? The (possibly incorrect) solution: teeth are too heavy for flying. The idea to make 'bird evolution' a problem is different from the idea about the solution to the problem. I would argue that changes in what constitutes a problem (e.g. the nature of bird evolution - or even evolution in general) take much longer than changes in what constitutes a solution to a problem (e.g. the specifics of how that evolution happened).

"But I think most concept papers should be written and submitted as fast as possible;
if it takes more than a month to make that happen, the ideas are probably getting stale
(or maybe we're just unused to the idea of moving so fast)."

This seems to be the crux of your point about adopting a faster means of publishing ideas. I think that if an idea is updated in the year that it takes to publish the paper, it was probably a crappy idea in the first place. And if someone lucks out to publish the same thing first, well the time was right. Though I'm mixed about this - I'd be pissed if someone scooped me, but I also feel that knowledge is a cultural institution, with ideas arising at the proper time and place, almost making the idea of 'individual authorship' moot. Small consolation if it happened to me, though.

3) I think that people should keep their ideas written down. And (if my brain is any indication) it's good to write ideas quickly because it's easy to forget them. I agree with you in principle here. But the reason for my agreement is NOT because we need to have some record for posterity so that philatelic historians can sift through the records to attribute the original idea to the author. The act of creation is one of performance. An idea is not an immutable corpuscle lying in the bottom of the pool of the subconsciousness, waiting to be found. An idea is a changeable, malleable substance that is enacted. By writing our thoughts down, by finding the proper arena for their articulation, by engaging in an appropriate career for expression, by interacting with others in a social setting, by attempting to convince, and so on - we perform an idea and if successful, hopefully influence the current state of understanding. Otherwise it may just drop to the ground like the coke-bottle in The Gods Must be Crazy. I guess what I'm trying to say is that if you get a creativity credit for just writing a thought down on a piece of paper, you better as hell get some kind of chocolate medallion for doing the work to get it in the textbooks.

(I've been waiting to use the cokebottle reference for a while)
thanks for the thought-provoking writings,
-eric

Wednesday, March 01, 2006

My new favorite one-liner

From this excellent post at this excellent blog:

"If you've never written a sentence fragment."

Write this way

Note: I wrote this to my fellow grad students in Integrative Biology at Berkeley. If you're not a biologist, I may not be talking to you. But feel free to read on and find out.

-------------------------------------------

We write at least two kinds of papers. There are conceptual/theoretical papers, which push things forward by advancing new ideas, and data-heavy papers, most of which are written to either validate or contradict ideas already in existence--often, ideas that were first floated in conceptual/theoretical papers. And of course there are papers that do both, but I think there are relatively few of those; that is, a lot of data papers are prostituted in a skimpy cloak of buzzwords, but few present genuinely new ideas. At least in my field. Your mileage may vary.

That's nothing new, and if that's all I had to say I wouldn't have written this.

The two kinds of papers get written differently, too. Conceptual papers can sometimes be banged out in an afternoon, especially if the ideas have been knocking around the author's head for a while. You already know what the other kind are like, because you're writing them right now. To take a substantive data paper from conception to submission in under a month is almost inconceivable; to do the same in under a year is still remarkable. But I think most concept papers should be written and submitted as fast as possible; if it takes more than a month to make that happen, the ideas are probably getting stale (or maybe we're just unused to the idea of moving so fast).

Maybe we should think about publishing the two kinds of papers differently. By the time a data study has been completed and written up as a data paper, any contained ideas are years old anyway. The delays inherent in the current system of academic publishing are irritating but not crippling. But they might be, for many conceptual papers. One of the great lessons of life is that if you thought of it, someone else could, too. The closer you are to the cutting edge, the more you should worry about someone else beating you into print. Another way of putting that is, if you're not worried about someone beating you into print, maybe you should be working on something more important (then again, maybe you're so far ahead of the field that you don't have to worry about imitators).

Fast-track journals are not ideal for data papers. It takes too much stuffing to get all those details into three or four pages. Online publication loosened the girdle, allowing the jiggling fat of data to spill out as supplementary information, which is often many times longer than the printed paper. But Science and Nature are the gatekeepers to the land of jobs and tenure, and we all know it.

In a rational system we might keep fast-track journals mainly for conceptual papers, or for data papers that are most urgent and will suffer least by being condensed, and "regular" journals for most data papers, and everyone would understand that the two kinds of journals were not to be segregated by importance of work published therein. But I suspect that it wouldn't work. As long as there are fast-track and slow-track journals, everyone will want to be in the fast-track journals, and that yearning will drive the economy of scientific publication.

What if we did away with journals entirely? Ever heard of arXiv?

If you have, don't worry, this will be short. If you haven't, arXiv is an e-print archive for mathematical and scientific papers. It's geared towards math, physics, and computer science, but I suspect that's an artifact of history; there's no reason arXiv wouldn't work for any field*. Most papers in those fields are "published" on arXiv as soon as they're written, usually in advance of or concurrent with submission to a journal (there is a minimal amount of screening to make sure the papers aren't complete garbage). This has had a big effect on what journals in those fields are for. Instead of being primarily for the dissemination of knowledge, the journals are now mostly sources of status. They confer a sort of legitimacy on the papers they publish. Meanwhile, the rest of the field has long since digested the new information and moved on. The role of "I had the idea first and I can prove it!" claim-staking has moved from the journals to arXiv--which, given the number of abuses of the peer-review system that I know of, has got to be a good thing.

* Some divisions of the humanities may never adopt such a system, either because their members lack the technical competence, or because instant distribution of papers would only highlight their absence of content.

What if "I had the idea first and I can prove it!", henceforth called creativity credit, expanded to take in blogs? It's not as crazy as it might sound. People are already citing physics and math blogs in the comments on arXiv. To my mind, it's only a matter of time before links to blogs are included in the arXiv papers themselves, and once they're in arXiv there is probably no barrier to getting them into print.

I'm sure that each of you is sitting on an idea or two, or maybe a dozen, that you haven't told anyone about. You haven't had time to do the work, but the idea is promising enough that you don't want to just give it away, either. Maybe you'll get to it yourself someday. Maybe you'll farm it out to a friend, collaborator, or a grad student of your own.

But maybe you won't. Maybe you'll never get around to doing anything with it, or maybe someone else will take the idea to fruition in the meantime. What a waste! Either the idea sits around, going unused by the community, or someone else has it and does the work and takes the credit. Wouldn't it be nice to get creativity credit without having to do the work? It would be sorta like giving the idea to one of your students, only better, because you could give it to _any_ student, and you'd get some credit without having to force your name onto a paper that your student wrote (I detest advisors that do that, I hope you're not working with one, and I hope you don't become one).

For example, open any ornithology textbook and you'll read that birds lost their teeth to save mass and improve flight performance. What a load of crap! Enamel is dense stuff, but compared to the mass of the whole body the mass of Archaeopteryx's teeth was trivial. Never mind that an entire radiation of Cretaceous birds had teeth and did just fine with them, or that a lot of extant birds have monster beaks that weigh many times what their ancestral teeth did anyway. As far as I can tell, every group of vertebrates that ever evolved a beak got rid of their teeth shortly thereafter (ornithischian dinos did keep their teeth--but not in the beak part of the jaw). I don't know why birds traded teeth for beaks, but it seems obvious that they did, and that it had nothing to do with making their heads lighter. Admittedly, I haven't done a lot of heavy lifting here, but if the unsupported-and-obviously-wrong hypothesis is good enough to be textbook boilerplate, then my unsupported-but-probably-right hypothesis is surely good enough to publish as is. But I haven't published it, and neither has anyone else. I assume no one else has published it either because they haven't noticed it, or because (like me) they've got better things to do than scrounge up the data to support it.

Some of you are probably thinking, "Fuck you! If you're not willing to do the work, why should you get credit for the idea?" It's not that I don't value grunt-work, or producing data papers. I do. Generating reams of data is often a good way to discover new ideas, and publishing those reams of data is good science, because everyone can play with the data and maybe come up with surprising new insights.

But if there's no credit for ideas themselves, people have no incentive to distribute them. We're back to sitting on our ideas until a likely grad student comes along...in a decade or two.

What I'd like is for ideas and work to be rewarded differently, and independently. I think that distribution of new ideas instantly, by way of some sort of Bio-arXiv, which might in turn link to or encompass the time-stamped blogs of everyone in the field, would be good for several reasons.

1. It would be a great source of project ideas for new students or for people looking for a change of pace. I've always admired people who list outstanding problems at the end of their papers, for just that reason.
2. It would be a good idea-sorter. If all the ideas currently in existence were freely available to everyone, and some of them still weren't being worked on, it might indicate that they're just not very interesting. Or maybe that they're extremely interesting, but hard to tackle empirically. You see? This would give a whole new way to parse the big ideas in our fields and pick out the promising avenues from the less promising.
3. It would be less wasteful, of time and effort. No more waiting years for new ideas to emerge. No more sinking years into a project only to find out that someone else had already solved the problem. And--hopefully--no more good ideas and good data languishing in unpublished theses and dissertations. With an arXiv-like system, filing and e-publication could be synonymous.

I had a few other pros lined up, but it's late and they've already slipped out of my mind. Too bad I didn't arXiv them. :-)

Finally, I don't think the question is, "Will biology catch up with arXiv-world?" It's not even, "When?" It's "What am I going to do to get ready for the transition?"

Here's a tip: start writing down your ideas.

Responses welcome.
