SeqQuests Findings

Overlooked protein homologies from twilight zone Smith-Waterman analysis

This site presents protein sequence relationships that appear to have been missed by standard annotation pipelines. They were identified through an all-against-all Smith-Waterman comparison of UniProt Swiss-Prot (~570,000 sequences), filtered to remove known relationships. The tree-building method does not produce an evolutionary tree; instead, it picks out the homologies that connect clusters. These weaker links are normally drowned out by the very strong connections within the clusters.
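For readers unfamiliar with the comparison step, the following is a minimal sketch of Smith-Waterman local alignment scoring applied to every pair in a small set. It is illustrative only: the match, mismatch and gap values are placeholder assumptions (a real protein search would normally use a substitution matrix and affine gap penalties), and the function names are hypothetical, not taken from the pipeline behind this site.

```python
from itertools import combinations


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty).

    The scoring values are illustrative assumptions, not the site's settings.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


def all_against_all(sequences):
    """Score every unordered pair in a dict of {identifier: sequence}."""
    return {
        (id_a, id_b): smith_waterman_score(sequences[id_a], sequences[id_b])
        for id_a, id_b in combinations(sorted(sequences), 2)
    }
```

The number of pairs grows quadratically with the number of sequences, so a pure-Python loop like this is only suitable for toy inputs; at the scale of Swiss-Prot an optimized implementation would be needed.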

The focus is on "twilight zone" similarities (15–30% identity, scores 140–300) where genuine homology is often obscured by sequence divergence or compositional bias. Standard tools tend to dismiss these as noise.
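The thresholds above can be written as a simple filter. This is a sketch under stated assumptions: percent identity is computed here over aligned columns where neither sequence has a gap (other definitions exist), the score is assumed to be the Smith-Waterman alignment score, and both function names are hypothetical.

```python
def percent_identity(row_a, row_b):
    """Percent identity over columns where neither aligned row has a gap.

    The choice of denominator is an assumption; definitions vary between tools.
    """
    columns = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


def in_twilight_zone(identity_pct, score,
                     identity_range=(15.0, 30.0), score_range=(140, 300)):
    """True if a pair falls inside the identity and score window quoted above."""
    return (identity_range[0] <= identity_pct <= identity_range[1]
            and score_range[0] <= score <= score_range[1])
```

For example, a pair at 22% identity with a score of 180 passes the filter, while a pair at 45% identity does not.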

Browse the findings →

What's here

The site lists 6,579 candidate protein pairs that remain after automated filtering. Approximately 20 of these, the distilled findings, have been identified as warranting annotation updates or representing novel biological insights. One example is shown in the sample alignment below.

Some of the remaining candidates may be of interest too, though many of them reflect similarities that are already evident from the protein names or from other annotations, just not in a form that the simple 'already known' filters could detect.

Sample alignment

P85828-E2ADG2 s(654) Length: 314/205
P85828: Prohormone-3; Apis mellifera (Honeybee).
E2ADG2: ITG-like peptide {ECO:0000303|PubMed:25641051}; Camponotus floridanus (Florida carpenter ant).

 89 MYTCVALTVVALVSTMHFGVEAWGGLFNRFSPEMLSNLGYGSHGDHISKSGLYQRPLSTSYGYSYDSLEE
    |....|.|.|....|...|||||||||||||||||||||||.||......||.|......||......||
  1 MRVYAAITLVLVANTAYIGVEAWGGLFNRFSPEMLSNLGYGGHGSYMNRPGLLQEGYDGIYGEGAEPTEE

159 VIPCYERKCTLNEHCCPGSICMNVDGDVGHCVFELGQKQGELCRNDNDCETGLMCAEVAGSETRSCQVPI
      |||||||..|.||||||||||..|..|.||...|..||||||.|.||||||||||..|      .
 71 --PCYERKCMYNDHCCPGSICMNFNGVTGTCVSDFGMTQGELCRRDSDCETGLMCAEMSG------H---

229 TSNKLYNEECNVSGECDISRGLCCQLQRRHRQTPRKVCSYFKDPLVCIGPVATDQIKSIVQYTSGEKRIT
           |||..|.||||||||||||||||||.|||||||||||||||||||||||||..||||||||||
130 -------EECAMSSECDISRGLCCQLQRRHRQAPRKVCSYFKDPLVCIGPVATDQIKSVIQYTSGEKRIT

299 GQGNRIFKR
    |||||.|||
193 GQGNRLFKR

The region annotated as "transmembrane" in P85828 (positions 90–112) aligns without gaps to the signal peptide of the ant ortholog. This suggests that the annotation should be updated and that P85828 is secreted rather than membrane-bound.
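The correspondence can be checked directly against the first block of the sample alignment. The sketch below walks the two aligned rows and reports which E2ADG2 positions the P85828 range 90–112 lands on; the function name is hypothetical, and the rows and start coordinates are copied from the alignment above.

```python
def map_query_range(query_row, target_row, query_start, target_start, q_from, q_to):
    """Map query positions q_from..q_to onto target positions in one alignment block.

    Returns (mapped target positions, identical residue count, gap count).
    """
    q_pos, t_pos = query_start - 1, target_start - 1
    mapped, identical, gaps = [], 0, 0
    for q_char, t_char in zip(query_row, target_row):
        if q_char != "-":
            q_pos += 1
        if t_char != "-":
            t_pos += 1
        if q_char == "-":
            # Insertion in the target that falls inside the query range.
            if q_from <= q_pos < q_to:
                gaps += 1
            continue
        if not (q_from <= q_pos <= q_to):
            continue
        if t_char == "-":
            gaps += 1  # query residue with no target counterpart
            continue
        mapped.append(t_pos)
        identical += q_char == t_char
    return mapped, identical, gaps


# First block of the sample alignment (P85828 from 89, E2ADG2 from 1).
QUERY = "MYTCVALTVVALVSTMHFGVEAWGGLFNRFSPEMLSNLGYGSHGDHISKSGLYQRPLSTSYGYSYDSLEE"
TARGET = "MRVYAAITLVLVANTAYIGVEAWGGLFNRFSPEMLSNLGYGGHGSYMNRPGLLQEGYDGIYGEGAEPTEE"

mapped, identical, gaps = map_query_range(QUERY, TARGET, 89, 1, 90, 112)
```

On this block, P85828 positions 90–112 map to E2ADG2 positions 2–24 with no gaps, and 10 of the 23 aligned residues are identical. The agreement is therefore positional rather than residue-for-residue, which is typical of signal peptides.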