Negative Control
Another necessary test of our algorithm is a set of biological negative controls. Just as proteins with known structures are used as controls in structure prediction, gene pairs without any functional relationship serve as negative controls here.
However, these negative controls are hard to select because the number of gene pairs with known functional relationships is very small. For instance, we found 1406 pairs of genes that are very tightly clustered together by our algorithm, but only 10 of these pairs (about 0.7%) have the same known function, which means that many gene pairs with the same function have not yet been discovered. Finding them is one of our aims in developing this algorithm: we want to predict new gene pairs that may share a function and thereby suggest directions for further research. Consequently, gene pairs without known functional relationships may in fact share a function whose relationship has simply not been discovered yet.
Gene pairs used as negative controls are selected according to the criteria below:
a) They are not in the same function class based on MIPS;
b) They have no interactions between their protein products based on two-hybrid data;
c) Their protein products are not in the same cellular compartment;
d) There are no documented relationships between them.
Again, even gene pairs that satisfy these criteria may share the same function, but the probability that they do not is much higher.
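
The selection criteria a) to d) above can be sketched as a simple filter over candidate gene pairs. The following Python code is a minimal illustration only, not the procedure actually used: the function name `select_negative_controls`, the dictionary and set inputs, and the toy gene names are all hypothetical. It also assumes that a gene lacking a functional class or compartment annotation is skipped, since a missing annotation is not evidence of unrelatedness.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def select_negative_controls(
    genes: Iterable[str],
    function_classes: Dict[str, Set[str]],   # gene -> functional classes (e.g. from MIPS)
    compartments: Dict[str, Set[str]],       # gene -> cellular compartments of its product
    two_hybrid: Set[FrozenSet[str]],         # pairs with a two-hybrid interaction
    documented: Set[FrozenSet[str]],         # pairs with any documented relationship
) -> List[Tuple[str, str]]:
    """Return gene pairs that satisfy criteria a) to d)."""
    controls = []
    for a, b in combinations(sorted(set(genes)), 2):
        pair = frozenset((a, b))
        fa, fb = function_classes.get(a), function_classes.get(b)
        ca, cb = compartments.get(a), compartments.get(b)
        # Assumption: unannotated genes cannot be judged, so skip them.
        if not fa or not fb or not ca or not cb:
            continue
        if fa & fb:              # a) same functional class
            continue
        if pair in two_hybrid:   # b) protein products interact
            continue
        if ca & cb:              # c) same cellular compartment
            continue
        if pair in documented:   # d) documented relationship
            continue
        controls.append((a, b))
    return controls


# Toy data with hypothetical gene names, for illustration only.
toy_functions = {
    "G1": {"metabolism"}, "G2": {"metabolism"},
    "G3": {"transport"}, "G4": {"transcription"},
}
toy_compartments = {
    "G1": {"cytoplasm"}, "G2": {"nucleus"},
    "G3": {"membrane"}, "G4": {"nucleus"},
}
toy_two_hybrid = {frozenset(("G1", "G3"))}
toy_documented = {frozenset(("G3", "G4"))}

toy_controls = select_negative_controls(
    ["G1", "G2", "G3", "G4"],
    toy_functions, toy_compartments, toy_two_hybrid, toy_documented,
)
print(toy_controls)  # [('G1', 'G4'), ('G2', 'G3')]
```

In the toy data, four of the six possible pairs are rejected, one by each criterion, and two remain as candidate negative controls. As noted above, passing the filter only makes a functional relationship less likely; it does not rule one out.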