Long time no blog! And, yes, as with most grad school bloggers, that was initially due to too much work, distraction, and a touch of laziness. But more recently, it’s because I’ve started a new project: podcasting! It’s called Unsupervised Thinking (a play on “unsupervised learning” in machine learning) and it’s a podcast about neuroscience, artificial intelligence, and science more broadly. And since my two fellow podcasters and I are PhD students in computational neuroscience, it’ll have a computational/systems bent.
Our first episode is on Blue Brain/Human Brain Project, which is the large EU-funded project to simulate the brain in a computer. Our next episode will be on brain-computer interface. Check it out by clicking below!
This is a piece about the present state, and potential future, of fraud in scientific research which I wrote for a Responsible Conduct in Research course taught at Columbia.
There seems to be a trend as of late of prominent scientific researchers being outed for fabrications or falsifications in their data. Diederik Stapel’s extravagant web of invented findings certainly stands out as one of the worst examples, and will probably do long-term damage to the field of psychology. But psychology is not alone; other realms of research are suffering from this plague too. For example, the UK government exercised for the first time its right to imprison scientific fraudsters when it sentenced Steven Eaton to 3 months for falsifying data regarding an anti-cancer drug. And accusations of fraud fly frequently from both sides of the debate over climate change. Studies suggest these misdeeds aren’t limited to just the names that make the news. In an attempt to quantify just how badly scientists are behaving, journalists sent out a misconduct questionnaire to medical science researchers in Belgium. Four out of the 315 anonymous respondents (1.3%) admitted to flat-out fabrication of data and 24% acknowledged seeing such fabrication done by others. Furthermore, analysis of publishing practices has shown a steep increase in the rate of retractions of journal articles since 2005, and investigations suggest that up to 43% of such retractions are due to fraud, with an additional 9.8% coming from plagiarism. It seems clear from both anecdotes and analysis that dishonesty abounds in the research world.
But as with any criminal activity, it is hard to really know how accurate statistics on fraud in scientific publishing can be. Is this wave of retractions and public floggings really a result of an increase in inappropriate behavior, or just an increase in the reporting of it? In other words, are we producing more scientists who are willing to lie, cheat, and steal to get ahead, or more who are willing to sound the alarm on those who do?
Certainly the current financial climate creates an incentive, a need even, for a researcher to stand out from the crowd of their peers like never before. To secure funding from grants, publications highlighting hot-topic research findings are a must. The less money going into science, the more competition there is for grants. So, those research findings must become hotter and more frequent. Furthermore, many of the same “high impact publication”-based criteria are used for determining who gets postdoc positions, assistant professorships, and even tenure. This kind of pressure could, and apparently does, lead some scientists to fake it when they can’t make it.
But while today’s economy may make it easier to justify cheating, today’s technology can make it harder to execute it. We have the ability to automatically search large datasets for the numerical anomalies or repetitions that are hallmarks of fabrication. The contents of an article can be compared to large databases of text to catch a plagiarized paragraph before any human eyes have read it. And the anonymity of the internet provides a way for anyone to report suspicious behavior of even the most senior of scientists without fearing retribution. Thus, it may seem obvious that case after case of fraud is being exposed.
No matter the specific reasons for this recent uptick, misconduct in research is something that always has been and always will be with us. In any competitive situation, with glory and profit on the line, some people will turn to deceit to get ahead. So what can we do to reduce the number of wrong-doers to the lowest possible level? Well, certainly the technological tools mentioned above can help. And some may argue that we should go further, and implement as much surveillance of scientists during their data-collecting as possible. Oversight can prevent the usually solitary scientist from engaging in any “data massaging” that they may have considered when no one was looking. Pre-registration of studies is another tool to ensure experimenters aren’t trying to fiddle with or cover up unsavory data. By stating, before the experiment even begins, what is meant to be tested and how, researchers will be less able to squeeze out whatever p<.05 trends they can find in the data and pretend that’s what they were looking for all along.
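To see why hunting for p<.05 trends after the fact is a problem, here is a minimal sketch of the arithmetic. It assumes (my simplification, not something from the studies above) that the tests are independent and that there is no real effect anywhere in the data; the function name `chance_of_false_positive` is just for illustration.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent tests of a
    true null hypothesis comes out 'significant' at threshold alpha."""
    # Each test stays (correctly) non-significant with probability 1 - alpha,
    # so all of them stay non-significant with probability (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


for n in (1, 5, 20):
    print(n, "tests:", round(chance_of_false_positive(n), 3))
```

With one pre-registered test the false-positive rate is the advertised 5%, but under these assumptions a researcher who quietly tries twenty comparisons has a better-than-even chance of finding something to report.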
While such tools can be effective in preventing the deed of fraud, I think, as a field, we would be better served by preventing the motivation for fraud. This means moving away from a funding system that puts unreasonable weight on flashy results and towards one that favors critical thinking, solid methods, and open data/code sharing. We will need to learn to evaluate our peers by these same criteria as well. Furthermore, our publishing process has to make room for the printing of negative results and replicated studies. The scientist who accidentally stumbles upon an intriguing finding shouldn’t necessarily be praised more highly than those who attempt to replicate a result they find suspicious or who have spent years tediously testing hypotheses which turn out to be incorrect. Certainly positive novel findings will continue to be the driving force of any field, and this explains why they took precedence when publishing resources were limited. But with today’s online publishing and quick searches, there is little justification for ignoring other kinds of findings. Additionally, it is now possible for journals to host large datasets and code repositories online along with their journal articles, allowing researchers to get credit for these contributions as well. Technological advancements can be used not only to catch fraud, but to implement the changes that will prevent the motivation for it as well.
Of course, incorporating these achievements will require a more complex means of evaluating scientists for grants and promotions, and this will take time. But it is crucial that we start. We need to create a culture that recognizes the importance of a good scientific process and the extreme harm done by introducing dishonesty into it. The hierarchical nature of science, with new studies being built on the backs of old ones, means that one small act of fraud can have far-reaching and potentially irreversible effects on the field. Furthermore, it damages the reputation of scientific research in the public eye, which can lessen confidence and support. People may have been upset to learn of Jonah Lehrer’s fraudulent reporting of neuroscience, but such concerns pale in comparison to learning of the fraudulent conducting of neuroscience. While fraud and data manipulation are hardly new problems, there can always be new solutions for combating them. We are lucky to live in an age that allows us the tools to detect such practices when they occur, and also to change the system that encourages them. While it is unlikely that we will ever fully eradicate scientific misconduct, we can hope to create a culture amongst scientists that makes dishonesty less common and that views fabrication as an unthinkable option.
Van Noorden, R. (2011). Science publishing: The trouble with retractions. Nature, 478 (7367), 26-28. DOI: 10.1038/478026a
The range of tools used to study the brain is vast. Neuroscientists toss together ideas from genetics, biochemistry, immunology, physics, computer science, medicine and countless other fields when choosing their techniques. We work on animals ranging from barely-visible worms and the common fruit fly to complicated creatures like mice, monkeys, and men. We record from any brain region we can reach, during all kinds of tasks, while the subject is awake or anesthetized, freely moving or fixed, a full animal or merely a slice of brain…and the list goes on. The result is a massive, complex cocktail of neuroscientific information.
Now, I’ve waxed romantic about the benefits of this diversity before. And I still do believe in the power of working in an interdisciplinary field; neuroscientists are creating an impressively vast collection of data points about the brain, and it is exciting to see that collection continuously grow in every direction. But in the interest of honesty, good journalism, and stirring up controversy, I think it’s time we look at the potential problems stemming from Neuroscience’s poly-methodological tendencies. And the heart of the issue, as I see it, is in how we are connecting all those points.
When we collect data from different animals, in different forms, and under different conditions, what we have is a lot of different datasets. Yet what we seem to be looking for, implicitly or explicitly, are some general theories of how neurons, networks, and brains as a whole work. So, for example, we get some results about the molecular properties needed for neurogenesis in the rat olfactory bulb, and we use these findings to support experiments done in the mouse and vice versa. What we’re assuming is that neurons in these different animals are doing the same task, and using the same means to accomplish it. But the fact is, there are a lot of different ways to accomplish a task, and many different combinations of input that will give you the same output. Combining these data sets as though they’re one could be muddling the message each is trying to send about how its system is working. It’s like trying to learn about a population with bimodally distributed variables by studying their means (see Fig 1). In order to get accurate outcomes, we need self-consistent data. If you use the gravity on the Moon to calculate how much force you need to take off from the Earth, you’re not going to get off the ground.
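The bimodal point can be made concrete with a quick simulation. This is only a toy sketch: the two subpopulations, their centers at -2 and +2, and the helper name `pooled_summary` are all made up for illustration and don't come from any real dataset.

```python
import random
import statistics


def pooled_summary(n_per_group=5000, seed=0):
    """Pool two hypothetical subpopulations and report the pooled mean
    plus the fraction of individuals that actually sit near that mean."""
    rng = random.Random(seed)
    # Two made-up groups with clearly separated values.
    group_a = [rng.gauss(-2.0, 0.5) for _ in range(n_per_group)]
    group_b = [rng.gauss(2.0, 0.5) for _ in range(n_per_group)]
    pooled = group_a + group_b
    mean = statistics.fmean(pooled)
    # How many samples fall within 0.5 units of the pooled mean?
    near_mean = sum(abs(x - mean) < 0.5 for x in pooled) / len(pooled)
    return mean, near_mean


mean, near_mean = pooled_summary()
print("pooled mean:", round(mean, 3))
print("fraction of samples near the mean:", round(near_mean, 4))
```

The pooled mean lands in the gap between the two groups, where almost no individual actually lives. A model built around that average describes neither population.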
Not to malign my own kind, but theorists, with their abstract “neural network” models, can actually be some of the worst offenders when it comes to data-muddling. By using average values for cellular and network properties pulled from many corners of the literature, and building networks that aren’t meant to have any specific correlate in the real world, modelers can end up with a simulated Frankenstein: technically impressive, yes, but not truly recreating the whole of any of its parts. This quest for the Platonic neural network—the desire to explain neural function in the abstract—seems, to me, misguided. Rather, even as theorists, we should not be attempting to explain how neurons do what they do—but rather how V1 cells in anesthetized adult cats show contrast invariant tuning, or how GABA interneurons contribute to gamma oscillations in mouse hippocampal slices, and so on. Being precise in determining what our models are trying to be will better fuel how we design and constrain them, and lead to more directly testable hypotheses. The search for what is common to all networks should be saved until we know more of what is specific to each.
Eve Marder at Brandeis University has been something of a crusader for the notion that models should be individualized. She’s taken to running simulations to show how the same behavior can be produced by a vast array of different parameter values. For example, in this PNAS paper, Marder shows that the same bursting firing patterns can be created by different sets of synaptic and membrane conductances (Fig 2). This shows how simply observing a similar phenomenon across different preparations is not enough to assume that the mechanisms producing it are the same. This assumption can lead to problems if, in the pursuit of understanding bursting mechanisms, we measured sodium conductances from the system on the left, and calcium conductances from that on the right. Any resulting model we could create incorporating both these values would be an inaccurate explanation of either system. It’s as though we’re combining the pieces from two different puzzles, and trying to reassemble them as one.
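A toy version of the puzzle-mixing problem, for the code-inclined. To be clear, this is not Marder's model: it is a bare-bones steady-state voltage for a cell with one excitatory and one inhibitory conductance, and the conductance values, reversal potentials, and function name are all hypothetical.

```python
def steady_state_voltage(g_exc, g_inh, e_exc=50.0, e_inh=-90.0):
    """Steady-state voltage (mV) of a toy cell with two conductances:
    a conductance-weighted average of the two reversal potentials."""
    return (g_exc * e_exc + g_inh * e_inh) / (g_exc + g_inh)


# Two hypothetical preparations with different conductances...
system_left = {"g_exc": 1.0, "g_inh": 2.0}
system_right = {"g_exc": 3.0, "g_inh": 6.0}

v_left = steady_state_voltage(**system_left)
v_right = steady_state_voltage(**system_right)
# ...and a "Frankenstein" built from one measurement out of each.
v_mixed = steady_state_voltage(system_left["g_exc"], system_right["g_inh"])

print("left:", round(v_left, 2), "right:", round(v_right, 2), "mixed:", round(v_mixed, 2))
```

The two systems produce the same output because only the ratio of conductances matters here, yet the hybrid built from one value of each matches neither. Real neurons are far messier, but the logic of the failure is the same.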
Now of course most researchers are aware of the potential differences across different preparations, and the fact that one cannot assume that what’s true for the anesthetized rat is true for the behaving one. But these sorts of concerns are usually relegated to a line or two in the introduction or discussion sections. On the whole, there is still the notion that ideas can be borrowed from nearby lines of research and bent to fit into the narrative of the hypothesis at hand. This view is not absurd, of course, and it comes partly from reason, but also from necessity: there are just some types of data that we can only get from certain preparations. Furthermore, time and resource constraints mean that it is frequently not feasible to run the exact experiment you may want. And on top of the practical reasons for combining data, there is also the fact that evolution supports the notion that molecules and mechanisms would be conserved across brain areas and species. This is, after all, why we feel justified in using animal models to investigate human function and disorder in the first place.
But, as with many things in Neuroscience, we simply can’t know until we know. It is not in our best interest, in the course of trying to understand how neural networks work, to assume that different networks are working in the same way. Certainly frameworks found to be true in specific areas can and should be investigated in others. But we have to be aware of when we are carefully importing ideas and using evidence to support the mixing of data, and when we’re simply throwing together whatever is on hand. Luckily, there are tools for this. Large-scale projects like those at the Allen Institute for Brain Science are doing a fantastic job of creating consistent, complete, detailed, and organized datasets of specific animal models. And even for smaller projects, neuroinformatics can help us keep track of what data comes from where, and how similar it is across preparations. Overall, it needn’t be a huge struggle to keep our lines of research straight, but it is important. Because a poorly mixed cocktail of data will just make the whole field dizzy.
Marder, E. (2011). Colloquium Paper: Variability, compensation, and modulation in neurons and circuits. Proceedings of the National Academy of Sciences, 108 (Supplement_3), 15542-15548. DOI: 10.1073/pnas.1010674108
Pursuing rewards is a crucial part of survival for any species. The circuitry that tells us to seek out pleasure is what ensures that we find food, drink, and mates. In order to engage in this behavior, we must learn associations between rewards and the stimuli that predict them. That way we can know that our caffeine craving, for example, can be quenched by seeking the siren in a green circle (it’s possible that I do my blog writing at a Starbucks–cuz I’m original like that). Studying this kind of reinforcement learning is big business, and there is still a lot left to find out. But what has been known for some 15 years now is that dopaminergic cells in the midbrain which encode reward value also encode reward expectation. That is, in the ventral tegmental area (VTA), cells increase their firing in response to the delivery of an unexpected reward, such as a sudden drop of juice on a monkey’s tongue. But cells here also fire in response to a reward cue, say a symbol on a screen that the monkey has learned predicts the juice reward. What’s more, after this cue, the arrival of the actual reward causes no change in the firing of these cells unless it is higher or lower than expected. So, these cells are learning to value the promise of a pleasurable stimulus, and signal whether that promise is fulfilled, denied, or exceeded. Suddenly, the sight of the siren is a reward on its own, and getting your coffee is merely neutral.
But the world is rarely just a series of cues and rewards. It’s complex and dynamic: a symbol may predict something positive in one context and punishment in another; reward contingencies can be uncertain or change over time; and with a constant stream of incoming stimuli how do you even figure out what acts as a reward cue in the first place? Luckily, Ethan Bromberg-Martin and Okihide Hikosaka are interested in explaining just these kinds of challenges, and they’ve made a discovery that offers a nice framework on which to build a deeper understanding. In this Neuron paper, Bromberg-Martin and Hikosaka developed a task to test monkeys’ views on information. To start, the monkey was shown one of two symbols, A or B, to which he had to saccade. After that, one of a set of four different symbols appeared: if A was initially shown then the second symbol would be A1 or A2, and likewise for B. The appearance of A1 always predicted a big water reward, and A2 always predicted a small water reward (which, to greedy monkeys who know a larger reward is possible, is essentially a punishment). But for B1 and B2, the water amount was randomized; these symbols were useless in providing reward information. So, the appearance of A meant that an informative symbol was on its way, whereas B meant something meaningless was coming. Importantly, the amount of reward was equal on average for A and B; it was only the advance knowledge of the reward that differed.
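The logic of the task design can be laid out in a few lines. This is a sketch under my own assumptions: the reward sizes (1.0 and 0.1) are invented, and I'm assuming big and small rewards are equally likely on both paths, which the description above implies but doesn't spell out.

```python
from itertools import product

BIG, SMALL = 1.0, 0.1  # hypothetical water amounts, arbitrary units


def outcomes(first_cue):
    """Equally likely (second cue, reward) pairs following the first cue."""
    if first_cue == "A":
        # Informative path: the second cue fully determines the reward.
        return [("A1", BIG), ("A2", SMALL)]
    # Uninformative path: second cue and reward vary independently.
    return list(product(("B1", "B2"), (BIG, SMALL)))


def mean_reward(first_cue):
    pairs = outcomes(first_cue)
    return sum(reward for _, reward in pairs) / len(pairs)


def rewards_after(first_cue, second_cue):
    """Distinct rewards that can still follow a given second cue."""
    return sorted({r for cue, r in outcomes(first_cue) if cue == second_cue})


print("mean reward, A vs B:", mean_reward("A"), mean_reward("B"))
print("possible rewards after A1:", rewards_after("A", "A1"))
print("possible rewards after B1:", rewards_after("B", "B1"))
```

Both paths pay the same on average; the only thing A buys the monkey is that the second symbol pins down what is coming.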
Recording from those familiar midbrain dopaminergic cells, the authors saw an increase in activity following the appearance of the information-predicting cue A, and a decrease in response to B. These cells then went on to do their normal duty: showing a large spike in response to A1 (the large reward cue), a decrease to A2, and no change in response when these predicted rewards were actually delivered; or, alternatively, little change in response to B1 and B2, and a spike/dip when an unpredictable large/small reward was delivered. What the initial response to A and B shows is that the VTA is responding to the promise of information about reward in the same way as it responds to the promise of a reward or a reward itself. This is further supported by the fact that when monkeys were presented with both A and B and allowed to choose which to saccade to, they overwhelmingly preferred A, leading them down the path of reward information.
This may seem like a silly preference. Choosing to be informed about the reward size beforehand doesn’t provide a greater reward size or allow the monkey any more control, so why bother valuing the advance information? The authors put forth the notion that uncertainty is in some way uncomfortable, so the earlier it is resolved the better. But I’m more inclined to believe their second assertion: the informative path (A) is preferred because it provides stable cue-reward associations that can be learned. The process of learning what cue predicts what reward assumes that there are cues that actually do predict reward. So if we want to achieve that goal we have to make sure we’re working in a regime where that base assumption is true, and this isn’t the case for uninformative path B. Living in a world of meaningless symbols means all your fancy mental equipment for associating cues and rewards is for naught, and it leaves you with little more than luck when it comes to finding what you need. So there is a clear evolutionary advantage in finding reward in (and thus seeking out) stable cue-reward associations.
But like most good discoveries, this one leaves us with a lot of questions, mainly about how the brain comes to find these stable associations rewarding. We know that for a cue to be associated with a reward, it needs to reliably precede that reward. Then through…well, some process that we’re working out the details of…VTA neurons start firing in response to the cue itself. So presumably in order for the brain to associate a certain cue with reward information, the cue has to reliably precede that information. Here’s where we hit a problem. It is easy enough to understand how the brain is aware that the cue was presented (that’s just a visual stimulus, no problem there), and we can equally well conceive of how it acknowledges the existence of a reward (again, just a physical stimulus which ends up making VTA cells spike), but how can the brain know that information is present? The information that a cue contains about an upcoming reward isn’t a physical stimulus out there in the world; it’s something contained in the brain itself. If we are to learn to associate an external cue with an internal entity like information, the brain needs to be able to monitor what’s happening inside itself the same way it monitors the outside world.
Luckily, there are possible mechanisms for this, and they fit well with the existing role of VTA cells. Here is the equation the brain seems to be using to make basic reward associations:
visual stimulus + VTA cell firing due to some delayed reward = VTA cell firing to visual stimulus.
But VTA cell firing is VTA cell firing, so we can substitute the second term with the righthand side of the equation and get:
visual stimulus #2 + VTA cell firing due to visual stimulus = VTA cell firing to visual stimulus #2
If pseudomath isn’t your thing: basically, the fact that the brain can learn to treat reward cues as reward means that it can learn to treat cues for reward cues as reward. And cues for cues for reward cues? Maybe, but I wouldn’t bet on it. While they did fire in response to the promise of information signified by cue A, the VTA cells still had their biggest spike increase in response to A1, the cue that signaled a big reward. It seems there’s a limit on how far removed a cue can be from an actual reward. Interestingly, this kind of metacognitive ability appears restricted to more cognitively complex animals such as primates, and probably contributes to their adaptability as a species. While this kind of study hasn’t been done in rats or mice, my guess is you’d be hard-pressed to find such a preference for information in those lower animals.
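If actual math is more your thing, the substitution in the pseudo-equations above is roughly what a temporal-difference (TD) learning rule does. The sketch below is a generic textbook-style TD(0) learner, not the analysis from the paper; the chain of states (cue #2, then cue #1, then reward), the learning rate `alpha`, the discount `gamma`, and the name `run_td` are all my own illustrative choices.

```python
def run_td(n_trials=500, alpha=0.1, gamma=1.0):
    """Toy TD(0) learner on the chain: cue #2 -> cue #1 -> reward."""
    # Learned value of each state; "end" is the terminal state after the reward.
    value = {"cue2": 0.0, "cue1": 0.0, "end": 0.0}
    # Each step: (current state, next state, reward received on the transition).
    steps = [("cue2", "cue1", 0.0), ("cue1", "end", 1.0)]
    errors = []
    for _ in range(n_trials):
        trial_errors = {}
        for state, next_state, reward in steps:
            # Prediction error: what happened minus what was expected.
            delta = reward + gamma * value[next_state] - value[state]
            value[state] += alpha * delta
            trial_errors[state] = delta
        errors.append(trial_errors)
    return value, errors


value, errors = run_td()
print("error at reward, first trial:", errors[0]["cue1"])
print("error at reward, last trial:", errors[-1]["cue1"])
print("learned values:", value)
```

Early on, the surprise happens at the reward. After learning, the reward itself produces almost no prediction error, and value has propagated back first to cue #1 and then to cue #2. Setting `gamma` below 1 makes each step back along the chain worth a bit less, which is one possible way (purely an assumption on my part) to picture a limit on how far removed a cue can be from an actual reward.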
Of course these findings leave us with something of a chicken-and-egg problem. Our desire for information is supposed to motivate us to pursue situations with stable cue-reward associations. But we can’t develop that desire until the cue-reward association is already mentally established, so what good is it then? There is also the question of how these results fit into the well-established desire that people (and animals) have for gambling. You’re not going to find a roulette wheel that will tell you where its ball is going to land, or a poker player willing to show you their cards. So what allows us to selectively love risk and uncertainty? Some theories suggest that the possibility for huge payoffs can lead to a miscalculation in expected reward and overpower our better, more rational instincts. But it’s still an area of research in economics as well as neuroscience. Basically, the evidence that reliable information is valued and sought after provides many insights into the process of reinforcement learning, but in order to fully understand its role and consequences, we are going to need more–you guessed it–information.
Bromberg-Martin, E., & Hikosaka, O. (2009). Midbrain Dopamine Neurons Signal Preference for Advance Information about Upcoming Rewards. Neuron, 63 (1), 119-126. DOI: 10.1016/j.neuron.2009.06.009
Talking with fellow scientists, it would seem that most have a love/hate relationship with the current state of scientific publishing. They dislike the fact that getting a Science or Nature paper seems to be the de facto goal of research these days, but don’t hesitate to pop open the bubbly if they achieve it. This somewhat contradictory attitude is not altogether unreasonable given the current setup. The fate of many a research career is dependent on getting a paper (or papers) into one of these ‘high impact’ journals. And as illogical as it seems for the ultimate measure of the importance of months of research to be in the hands of a couple of editors and one to three peer reviewers, these are the rules of the game. And if you want to get ahead, you gotta play.
The tides, however, may possibly be changing. Many smaller journals have cropped up recently, focusing on specific areas of research and implementing a more open and accepting review process. PLoS ONE and Frontiers are at the forefront of this. Since the mid-2000s, these journals have been publishing papers based purely on technical merit, rather than some pre-judged notion of importance. This leads to a roughly 70-90% acceptance rate (compared to Nature’s 8% and Science’s <7%), and a much quicker submission-to-print time. It also necessitates a post-publishing assessment of the importance and interest level of each piece of research. PLoS achieves this through article-level metrics related to views, citations, blog coverage, etc. Frontiers offers a similar quantification of interest, and the ability of readers to leave commentary. Basically, these publications recognize the flaws inherent in the pre-publication review system and try to redress them. PLoS says it best themselves:
“Too often a journal’s decision to publish a paper is dominated by what the Editor/s think is interesting and will gain greater readership — both of which are subjective judgments and lead to decisions which are frustrating and delay the publication of your work. PLOS ONE will rigorously peer-review your submissions and publish all papers that are judged to be technically sound. Judgments about the importance of any particular paper are then made after publication by the readership (who are the most qualified to determine what is of interest to them).”
So we have the framework for a new type of review and publication process. But such a tool is only helpful to the extent that we utilize it. Namely, we need to start recognizing and rewarding the researchers who publish good work in these journals. This also implies putting less emphasis on the old giants, Nature and Science. But how will these behemoth establishments respond to the revolution? Well, we may soon find out. NPG, the publishing company behind Nature, has recently announced a majority investment in Frontiers. The press release stresses that Frontiers will continue to operate under its own policies, but describes how Frontiers and Nature will interact to expand the number of open access articles available on both sides. Interestingly, the release also states that “Frontiers and NPG will also be working together on innovations in open science tools, networking, and publication processes.” A quote from Dr. Philip Campbell, Editor-in-Chief of Nature, is even more revealing:
“Referees and handling editors are named on published papers, which is very unusual in the life sciences community. Nature has experimented with open peer review in the past, and we continue to be interested in researchers’ attitudes. Frontiers also encourages non-peer reviewed open access commentary, empowering the academic community to openly discuss some of the grand challenges of science with a wide audience.”
Putting (perhaps somewhat naively) conspiracies of an evil corporate takeover aside, could this move mean that the revolution will be a peaceful one? That Nature sees the writing on the wall and is choosing to adapt rather than perish?
If so, what would a post-pre-publication-review world look like? Clearly if some sort of crowdsourcing is going to be used to determine a researcher’s merit, it will have to be reliable and standardized. For example, looking at the number of citations per article views/downloads is helpful in determining if an article is merely well-promoted, but not necessarily helpful to the community—or vice-versa. And more detailed information can be gathered about how the article is cited: are its results being refuted or supported? is it one in a list of many background citations or the very basis of a new project? Furthermore, whatever pre-publishing review the article has been submitted to (for technical merit, clarity, etc) should be posted alongside the article along with reviewer names. The post-publishing commentary will also need to be formalized (with rankings on different aspects of the research: originality, clarity, impact, etc) and fully open (no anon trolls or self-upvoting allowed). Making participation in the online community a mandatory part of getting a paper published can ensure that papers don’t go un-reviewed (the very new journal PeerJ uses something like this). If sites like Wikipedia and Quora have taught us anything (aside from how everything your mother warned you about is wrong), it’s that people don’t mind sharing their knowledge, especially if they get to do it in their own way/time. And since the whole process will be open, any “you scratch my back, I’ll scratch yours” behavior will be noticeable to the masses. The practices of Frontiers and PLoS are steps in the right direction, but their article metrics will need to be richer and more reliable if they are to take the place of big-name journal publications for a measure of research success.
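As a rough illustration of the citations-per-view idea, here is a minimal sketch. The two articles and their numbers are entirely made up, and `citations_per_view` is a hypothetical helper, not a metric that PLoS or Frontiers actually reports.

```python
def citations_per_view(citations, views):
    """Citations earned per article view: a crude usefulness-vs-hype ratio."""
    if views <= 0:
        raise ValueError("views must be positive")
    return citations / views


# Two invented articles: one heavily promoted, one quietly useful.
articles = {
    "well_promoted": {"citations": 10, "views": 20000},
    "quietly_useful": {"citations": 40, "views": 2000},
}

for name, stats in articles.items():
    ratio = citations_per_view(stats["citations"], stats["views"])
    print(name, ratio)
```

Raw view counts would rank the well-promoted article first, while the ratio flips the order. Any real metric would need to handle field size, article age, and gaming, which is exactly why these measures need to get richer before they can carry much weight.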
Some people feel that appropriate post-publishing review makes any role of a journal unnecessary. Everyone could simply post their papers to an e-print service like arXiv, as many in the physical sciences do, and never officially submit anywhere. Personally, I still see a role for journals: not in placing a stamp of approval on work, but in hosting papers and aggregating and promoting research of a specific kind. I enjoy having the list of Frontiers in Neuroscience articles delivered to my inbox for my perusal. And I like having some notion of where to go if I need information from outside my field. And as datasets become larger and figures more interactive, self-publishing and hosting will become more burdensome. Furthermore, there’s the simple fact that having an outside set of eyes proofread your paper and provide feedback before widely distributing it is rarely a bad thing.
But what then is left of the big guns, Science and Nature? They’ve never claimed a role in aggregating papers for specific sub-fields or providing lots of detailed information; quite the opposite, in fact. Science’s mission is:
“to publish those papers that are most influential in their fields or across fields and that will significantly advance scientific understanding. Selected papers should present novel and broadly important data, syntheses, or concepts. They should merit the recognition by the scientific community and general public provided by publication in Science, beyond that provided by specialty journals.” [emphasis added].
Nature expresses a similar desire for work “of interest to an interdisciplinary readership.” And given that Science and Nature papers have some of the strictest length limits, they’re clearly not interested in extensive data analysis. They want results that are short, sweet and pack a big punch for any reader. The problem is that this isn’t usually how research works. Any strong, concise story was built on years of messy smaller studies, false starts, and negative results, most of which are kept from the rest of the research community while the researcher strives to make everything fall in place for a proper article submission. But in a world of minimal review, any progress (or confirmation of previous results, for that matter) can be shared. Then, rather than let a handful of reviewers (n=3? come on, that only works for monkey research) try to predict the importance of the work, time and the community itself will let it be known. Nature and Science can still achieve their goals of distributing important work across disciplines, but they can do it once the importance has been established, by asking the authors of good work to write a review encapsulating their recent breakthroughs. That way, getting a piece in Nature and Science remains prestigious, and also merited. Yes, this means a delay between the work and publication in these journals, but if importance really is their criterion, this delay is necessary (has any scientist gotten the Nobel Prize six months after their discovery?). Furthermore, it’ll stop the cycle of self-fulfilling research importance whereby Nature or Science deems a topic important, prints about it, and thus makes it more important. This cycle is especially potent given the fact that the mainstream media frequently looks to Science and Nature for current trends and findings, and thus their power goes beyond just the research community. In a system where their publications were proven to be important, this promotion to the press would be warranted.
The goals of the big name journals are admirable: trying to acknowledge important work and spread it in a readable fashion to a broader audience. But their ways of going about it are antiquated. There was a time when getting a couple of the current thinkers in a field together to decide the merit of a piece of work was the only reasonable technique, but those days are gone. The data-collecting power of web-based tools is immense and their application is incredibly apropos here. We have the technology; we can rebuild the review system. NPG’s involvement with Frontiers offers hope (to the optimistic) that these ideas will come to fruition. We already know that Frontiers supports them. We just need the community at large to follow suit in order to give validity to these measures.
Since the vague reference to it in the State of the Union and the subsequent report by the New York Times, the neuro-sphere has been abuzz with debate over the proposed Brain Activity Map (BAM) project put forth by the Obama administration. While the details have not been formally announced yet, it is generally agreed that the project will be a ten-year, 3-billion-dollar initiative organized primarily by the Office of Science and Technology Policy with participation from the NSF, NIH, and DoD along with some private institutions. The goal is to coordinate a large-scale effort to create a full mapping of brain activity, down to the level of individual cell firing. It has been likened to the Human Genome Project, both in size and in potential to change our understanding of ourselves.
But, while optimism regarding the power of ambitious scientific endeavors shouldn’t be discouraged (I’m no enemy to “Big Science”), it is important to ask what we really expect to get out of this venture. Let’s start with the scientific goals themselves. What is actually meant by “mapping brain activity”? The idea for this project is supposedly based on a paper published in Neuron in June (which itself stemmed from a meeting hosted by the Kavli Foundation). In it, the authors express the desire to capture every action potential from every cell in a circuit, over timescales “on which behavioral output, or mental states, occur.” This proposal is very well-intentioned, but equally vague. Yes, most any neuroscientist would love to know the activity of every cell at all times. And by focusing on activity we know that incidental features like neurotransmitters used or cell size and shape won’t be of much importance. But when enlisting hordes of neuroscientists to dedicate themselves to a collective effort, desired outcomes need to be made explicit. What, for example, is meant by a circuit? The authors give examples suggesting that anatomical divisions would be the defining lines (like focusing on the medulla of the fly). But given the importance of inter-area connections for computations, anatomical divisions aren’t always the best option for identifying and understanding a circuit with a specific purpose. Next, we have the notion of collecting data over the course of behavioral outputs or mental states. Putting aside the blatant opacity of these terms, there remains the question of what kind of ‘mental states’ and ‘behavioral outputs’ we want to measure. Is the desire to get a snapshot of the brain in some kind of ‘null state’ and then compare that to activity patterns that occur during specific tasks? But which tasks? And how many? 
Furthermore, the higher the complexity of the species and the task at hand, the more likely there are to be individual differences in the activity maps across animals. Whether or not taking the average appropriately captures the function of individual circuits is debatable. Finally, even the conceptually simple goal of recording every action potential is open to interpretation. Do we want actual waveforms? Most people consider spike times to be the bread and butter of any activity measure, but that still leaves open the question of how much temporal precision we desire. All of these seemingly minor details can have a large impact on experimental design and technique.
Assuming, however, that all these details are sorted out (as they must be), we’re then left grappling with our expectations over what this data will mean. A quick poll of the media might have you believe that a BAM is akin to a mental illness panacea, a blueprint for our cyborg future, and the answer to whether or not we have a soul. Most in the field are, thankfully, less starry-eyed. To us, a record of every cell’s activity will result in… a lot of data, and thus, the need to develop tools to understand that data. This will boost the demand for theoretical models that attempt to account for how the dynamics of neural activity can implement information processing. As the dataset of activity patterns grows, these models will become more constrained and, presumably, more accurate. Ideally, this will eventually lead to an understanding of the mechanisms by which brains process inputs and produce outputs. Acknowledging this more sober goal as the true potential outcome of the BAM project, some neuroscientists feel that the loftier promises made by those outside the field (and even some within) are dishonest and, in a sense, manipulative. The fact is the acquisition of this data does not guarantee our understanding of it, and our understanding of it does not ensure the immediate production of tangible benefits to society. To make a direct link between this project and a cure for Parkinson’s or the advent of downloadable memories is fraudulent. These are not the immediate goals. But science is of course a cumulative process, and the more we understand about the brain the better equipped we are to pursue avenues to treat and enhance it. The creation of a BAM is, I believe, a good approach to advance that understanding and thus has the potential to be very beneficial.
But even if the anticipated long-term and philosophical results of the Obama administration’s project are hazy, there are some more concrete benefits expected to come out of it: mainly, advances in all kinds of technologies. The notion that the government sinking large sums of money into a scientific endeavor leads to economic and technological progress is widely agreed upon and fairly well-supported. And the nature of this project indicates it could have an effect on a wide variety of fields. Google, Qualcomm, and Microsoft have already been cited as potential partners in the effort to manage the astronomical amounts of data this work would create. The NYT article cites an explicit desire to invest in nanoscience research, potentially as a new avenue for creating voltage indicators. Optics is also a huge component of this task, so advances in microscopy are a necessity. Furthermore, techniques currently in use tend to exploit animal-specific properties that make them impossible to translate to other species (such as the fact that zebrafish can be made to be transparent and we, currently, cannot). So if this really is to be a human brain activity map (as the government seems to suggest), a whole new level of non-invasive imaging techniques will need to be devised. Another potential solution to that problem, as the Neuron article suggests, requires investment in the development of synthetic biological markers that may not rely on imaging in order to record neural activity. In another vein, the article also makes a point of defending the notion that all data obtained should be made publicly available. This project might then have the added benefit of advancing the open access cause and spurring new technologies for public data sharing.
Overall, it is important to take a realistic view on what to expect from a project of this magnitude and ambition. It can be tempting, as it frequently is with studies of the brain especially, to overstate or romanticize the potential results and implications. On the whole this benefits no one. What’s important is finding the right level of realistic optimism that recognizes the importance of the work, even without attaching to it the more grandiose expectations. Furthermore, a safe bet can be made on the fact that if this project comes to fruition, the work itself will have some immediate tangential benefits (economic and technological) and produce unforeseeable ripples in many fields for years to come.
Recently, I was charged with giving a presentation to a group of high schoolers preparing for the Brain Bee on the topic of computational approaches to neuroscience. Of course, in order to reach my goal of informing and exciting these kids about the subject, I had to start with the very basic questions of ‘what’ and ‘why.’ It seems like this task should be simple enough for someone in the field. But what I’ve discovered–in the course of doing computational work and in trying to explain my work to others–is that neither answer is entirely straightforward. There is the general notion that computational neuroscience is an approach to studying the brain that uses mathematics and computational modeling. But as far as what exactly falls under that umbrella and why it’s done, we are far from having a consensus. Ask anyone off the street and they’re probably unaware that computational neuroscience exists. Scientists and even other neuroscientists are more likely to have encountered it but don’t necessarily understand the motivation for it or see the benefits. And even the people doing computational work will come up with different definitions and claim different end goals.
So to add to that occasionally disharmonious chorus of voices, I’d like to present my own explanation of what computational neuroscience is and why we do it. And while the topic itself may be complicated and convoluted, my description, I hope, will not be. Basically, I want to stress that computational neuroscience is merely a continuation of the normal observation- and model-based approach to research that explains what so many other scientists do. It needn’t be more difficult to justify or explain than any other methodology. Its potential to be viewed as something qualitatively different comes from the complex and relatively abstract nature of the tools it uses. But the choice of those tools is necessitated simply by the complex and relatively abstract nature of what they’re being applied to, the brain. At its core, however, computational neuroscience follows the same basic steps common to any scientific practice: making observations, combining observations into a conceptual framework, and using that framework to explain or predict further observations.
That was, after all, the process used by two of the founding members of computational neuroscience, Hodgkin and Huxley. They used a large set of data about membrane voltage, ion concentrations, conductances, and currents associated with the squid giant axon (much of which they impressively collected themselves). They integrated the patterns that they found in this data into a model of a neural membrane, which they laid out as a set of coupled mathematical equations each representing different aspects of the membrane. Given the right parameters, the solutions to these equations matched what was seen experimentally. If a given amount of current injection made the squid giant axon spike, then you could put in the same amount of current as a parameter in the equations and you would see the value of the membrane potential respond the same way. Thus, this set of equations served (and still does serve) as a framework for understanding and predicting a neuron’s response under different conditions. With some understanding of what each of the parameters in the equations is meant to represent physically, this model has great explanatory power (as defined here) and provides some intuition about what is really happening at the membrane. By providing a unified explanation for a set of observations, the Hodgkin-Huxley model does exactly what any good scientific theory should do.
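For readers who want to see what those coupled equations look like, here is the model as it is usually written in modern textbooks. This is a sketch in today's notation and sign convention rather than the notation of the original papers; the symbols (C_m for membrane capacitance, the maximal conductances \bar{g}, the reversal potentials E, and the gating variables m, h, n) are the conventional names, not something defined earlier in this post.

```latex
% Membrane voltage: injected current minus sodium, potassium, and leak currents
C_m \frac{dV}{dt} = I_{\text{ext}}
    - \bar{g}_{\text{Na}}\, m^3 h \,(V - E_{\text{Na}})
    - \bar{g}_{\text{K}}\, n^4 \,(V - E_{\text{K}})
    - \bar{g}_{L}\,(V - E_{L})

% Each gating variable relaxes with voltage-dependent opening and closing rates
\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\, x,
    \qquad x \in \{m, h, n\}
```

The injected current I_ext is the "parameter you put in," and V(t) is the membrane potential you read out. The variables m and h are the two sodium gates, and n is the potassium gate.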
It may seem, perhaps, that the actual mathematical model is superfluous. If Hodgkin and Huxley knew enough to know how to build the model, and if knowledge of what the model means has to be applied in order to understand its results, then what is the mathematical model contributing? Two things that math is great for: precision and the ability to handle complexity. If we wanted to, say, predict what happens when we throw a ball up in the air, we could use a very simple conceptual model that says the force of gravity will counteract the throwing force, causing the ball to go up, pause at its peak height, and come back down. So we could use this to predict that more force would allow the ball to evade gravity’s pull for longer. But how much longer? Without using previous experiments to quantify the force of gravity and formalize its effect in the form of an equation, we can’t know. So, building mathematical models allows for more precise predictions. Furthermore, what if we wanted to perform this experiment on a windy day, or put a jetpack on the ball, or see what happens in the presence of a second planet’s gravitational pull, or all of the above? The more complicated a system is, and the more its component parts counteract each other, the less likely it is that simply “thinking through” a conceptual model will provide the correct results. This is especially true in the nervous system, where all the moving parts can interact with each other in frequently nonlinear ways, providing some unintuitive results. For example, the Hodgkin-Huxley model demonstrates a peculiar ability of some neurons: the post-inhibitory rebound spike. This is when a cell fires (counterintuitively) after the application of an inhibitory input. It occurs due to the reliance of the sodium channels on two different mechanisms for opening, and the fact that these mechanisms respond to voltage changes on different timescales.
This phenomenon would not be understandable without a model that had the appropriate complexity (multiple sodium channel gates) and precision (exact timescales for each). So, building models is not a fundamentally different approach to science; we do it every time we infer some kind of functional explanation for a process. However, formalizing our models in terms of mathematics allows us to see and understand more minute and complex processes.
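To make the rebound spike concrete, here is a minimal sketch of a Hodgkin-Huxley simulation. It assumes the standard textbook parameter values and rate functions (rest near -65 mV) and a simple forward-Euler integrator. The function names, the pulse size (-10 µA/cm² for 30 ms), and the 0 mV spike threshold are my own illustrative choices, not anything from the original papers.

```python
import math

# Standard textbook Hodgkin-Huxley parameters (assumed; modern sign convention).
C_M = 1.0                              # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3      # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387  # reversal potentials, mV


def _ratio(x, y):
    """x / (1 - exp(-x / y)), using the limit y when x is (numerically) zero."""
    return y if abs(x) < 1e-7 else x / (1.0 - math.exp(-x / y))


def rates(v):
    """Opening (alpha) and closing (beta) rates in 1/ms for the gates m, h, n."""
    a_m = 0.1 * _ratio(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _ratio(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return (a_m, b_m), (a_h, b_h), (a_n, b_n)


def simulate(current, t_stop=100.0, dt=0.01, v0=-65.0):
    """Integrate the model with forward Euler; current(t) is in uA/cm^2, t in ms."""
    v = v0
    # Start every gate at its steady-state value for the initial voltage.
    m, h, n = (a / (a + b) for a, b in rates(v))
    trace = []
    for step in range(int(round(t_stop / dt))):
        t = step * dt
        (a_m, b_m), (a_h, b_h), (a_n, b_n) = rates(v)
        i_ion = (G_NA * m**3 * h * (v - E_NA)
                 + G_K * n**4 * (v - E_K)
                 + G_L * (v - E_L))
        dv = (current(t) - i_ion) / C_M
        m += dt * (a_m * (1.0 - m) - b_m * m)  # fast sodium activation gate
        h += dt * (a_h * (1.0 - h) - b_h * h)  # slow sodium inactivation gate
        n += dt * (a_n * (1.0 - n) - b_n * n)  # slow potassium gate
        v += dt * dv
        trace.append(v)
    return trace


def spike_times(trace, dt=0.01, threshold=0.0):
    """Times (ms) at which the voltage crosses the threshold from below."""
    return [i * dt for i in range(1, len(trace))
            if trace[i - 1] < threshold <= trace[i]]


def no_input(t):
    return 0.0


def inhibitory_pulse(t):
    """Hypothetical stimulus: hyperpolarizing current between 10 and 40 ms."""
    return -10.0 if 10.0 <= t < 40.0 else 0.0


if __name__ == "__main__":
    print("spikes with no input:", spike_times(simulate(no_input)))
    print("spikes after inhibitory pulse:", spike_times(simulate(inhibitory_pulse)))
```

Run as written, the control should stay silent and the inhibited cell should fire shortly after the pulse ends at 40 ms. The two sodium gates explain why: during the pulse the slow inactivation gate `h` opens up (and the potassium gate `n` closes), and when the pulse ends the fast activation gate `m` catches up with the voltage before `h` and `n` have had time to recover.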
Additionally, the act of building explicit models requires that we investigate which properties are worth modeling and in what level of detail. In this way, we discover what is crucial for a given phenomenon to occur and what is not. In many regards, this can be considered a main goal of computational modeling. The Human Brain Project seeks to use its 1 billion Euro prize to model the human brain in the highest level of detail and complexity possible. But, as many detractors point out, having a complete model of the brain in equation form does little to decrease the mystery of it. The value of this simulation, I would say, then comes in seeing what parameters can be fudged, tweaked, or removed entirely and still allow the proper behavior to occur. Basically, we want to build it to learn how to break it. Furthermore, as with any hypothesis testing, the real opportunity comes when the predictions from this large-scale model don’t line up with reality. This lets us hunt for the crucial aspect that’s missing.
Computational neuroscience, however, is more than just modeling of neurons. But, in the same way that computational models are just an extension of the normal scientific practice of modeling, the rest of computational neuroscience is just an extension of other regular scientific practices as well. It is the nature of what we’re studying, however, that makes this not entirely obvious. Say you want to investigate the function of the liver. Knowing it has some role in the processing of toxins, it makes sense to measure toxin levels in the blood, presence of enzymes in the liver, etc., when trying to understand how it works. But the brain is known to have a role in processing information. So we have to try, as best we can, to quantify and measure that. This leads to some abstract concepts about how much information the activity of a population of cells contains and how that information is being transferred between populations. The fact that we don’t even know exactly what feature of the neural activity contains this information does not make the process any simpler. But the basic notion of desiring to quantify an important aspect of your system of interest is in no way novel. And much of computational neuroscience is simply trying to do that.
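As one illustration of what "quantifying information" can look like in practice, here is a small sketch that computes Shannon mutual information between a stimulus label and a cell's spike count. Mutual information is only one common choice of measure, and I'm picking it here as an assumption. The function name and the toy trial data are hypothetical, invented purely for the example.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of mutual information, in bits, from (stimulus, response) pairs."""
    n = len(pairs)
    joint = Counter(pairs)                       # counts of each (stimulus, response)
    stim = Counter(s for s, _ in pairs)          # counts of each stimulus
    resp = Counter(r for _, r in pairs)          # counts of each response
    mi = 0.0
    for (s, r), c in joint.items():
        # p(s, r) * log2( p(s, r) / (p(s) * p(r)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (stim[s] * resp[r]))
    return mi


# Toy data (made up): each trial is (stimulus shown, spike count observed).
informative = [("A", 0)] * 50 + [("B", 5)] * 50
uninformative = ([("A", 0)] * 25 + [("A", 5)] * 25
                 + [("B", 0)] * 25 + [("B", 5)] * 25)

if __name__ == "__main__":
    print("cell that separates A from B:", mutual_information(informative), "bits")
    print("cell that ignores the stimulus:", mutual_information(uninformative), "bits")
```

In the first toy dataset the spike count identifies the stimulus perfectly, so the cell carries one full bit about it. In the second the count is unrelated to the stimulus and the measure is zero. Real recordings are far messier: this simple counting estimate is biased when trials are few, and it presumes we already know that spike count is the feature carrying the information, which is exactly the open question.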
So, the honest answer to the question of what computational neuroscience is is that it is the study of the brain. We do it because we want to know how the brain works, or doesn’t work. But, as a hugely complex system with a myriad of functions (some of which are ill- or undefined), the brain is not an easy study. If we want to make progress we need to choose our tools accordingly. So we end up with a variety of approaches that rely heavily on computations as a means of managing the complexity and measuring the function. But this does not necessarily mean that computational neuroscientists belong to a separate school of thought. The fact that we can use computers and computations to understand the brain does not mean that the brain works like a computer. We merely recognize the limitations inherent in studying the brain, and we are willing to take help wherever we can get it in order to work around them. In this way, computational approaches to neuroscience simply emerge as potential solutions to the very complicated problem of understanding the brain.
Kaplan, D. (2011). Explanation and description in computational neuroscience. Synthese, 183(3), 339-373. DOI: 10.1007/s11229-011-9970-0