Our group's involvement in this work was principally related to the analysis of gene expression data, the identification and prioritization of differentially expressed genes and (at the revision stage) the comparison of our data with publicly available gene expression profiles in order to validate the main finding of the paper, namely that PML knockdown cells have a profile similar to that of differentiated epiblast-like cells. Antonis Klonizakis, an undergraduate student from our group, had to go through the mill of finding, analyzing and comparing public gene expression datasets to show that PML knockdown cells resemble a state of primed differentiation, intermediate between ES and epiblast cells.
First of all, there is biological variability that you can't do away with. Stem cells come in different types, "flavours" and various cell lines that are as different from each other as they are from other cell types. Even after having located profiles of the same kind of stem cells we were dealing with, we were faced with problems that had to do with the standardization of data processing. Many (most) papers failed to adequately report their data processing steps, and thus we were unable to reproduce the results they were reporting by analyzing the raw data ourselves. This may sound like an excuse for irreproducibility, but in my view it is the main reason behind many of the irreproducibility issues in research, which have recently caught the attention even of media such as the Wall Street Journal (as if they didn't have enough Wall Street-related problems to deal with already). Lack of standardization is a major issue for two reasons: first, it makes it very likely that the results you come up with by repeating a series of complicated processing steps (that are not thoroughly reported in the original paper) do not match the ones reported; and second, it makes the whole idea of comparing data so discouraging that in many cases it is preferable to repeat the whole experiment yourself. To the non-biomedically-oriented readers this may sound like an incredible waste of time, money and human resources, but it is so commonplace that it was the original suggestion by the reviewers of Christiana's paper. What they asked for was to conduct gene expression analyses in other ESC lines, for which data were surely available already.
Going beyond standardization, the situation becomes even worse when one considers the availability of data. In their search for datasets to compare with, Antonis and Christiana came up with papers such as this one, for which the data were not only not standardized but not even reported. That is right! You skim the paper for GEO or ArrayExpress links and find none, you read it carefully, you go through the (rudimentary) supplementary data and you still find nothing, then you (in this case I) write to both the corresponding author and the editors of a respectable journal and you are still waiting for an answer three months later. Such situations may (and should) be unheard of in other contexts but are somehow acceptable in the highly competitive field of biomedicine. To people like us, though, who hope that the accumulation, cross-comparison and validation of data may be a way to acquire new knowledge, all this is particularly disturbing. Not only because it makes our work harder, but because it also makes everyone else's less significant.