The Open Protein Structure Annotation Network



    From our abstract submitted to the Automated Function Prediction (AFP2006) meeting:

    Structural Genomics (SG) efforts, in particular those of the National Institutes of Health (NIH)-sponsored Protein Structure Initiative (PSI), which includes the Joint Center for Structural Genomics (JCSG), have impacted structural biology in many ways, some of which were not completely anticipated. Since their inception in 2000, the PSI centers have determined high-resolution structures of more than 1200 proteins, many of them representing novel, previously uncharacterized protein families [1]. In this category, SG centers already outpace the rest of the structural biology field [1,2]. Another major impact of SG efforts came from the rapid progress in developing automated procedures for all steps of protein structure determination, which are increasingly being adopted by mainstream structural biology. Both these developments were expected and welcomed by the scientific community. However, the protein architectural information flowing from SG has not been assimilated into mainstream research as rapidly or as widely as that generated by traditional structural biology. We believe the reason for this unanticipated situation is that, unlike in traditional structural biology, structure determination at SG centers is, inevitably, not always (nor even routinely) accompanied by a local stream of connected, synergistic biochemical and biological research. Consequently, the vast majority of protein structures determined by SG centers lack these complementary details and are not described in high-impact, peer-reviewed manuscripts, the principal way by which scientists communicate. Instead, the end result of the work of an SG center is usually a set of coordinates deposited in the PDB, information that is not readily assimilated by a typical biologist; opportunities are likely often missed because the scientific application is not recognized. As a result, data from structural genomics is only very slowly absorbed into the wider research stream, largely as correlated experimental data arises.

    The goal of our project is to develop “The Open Protein Structure Annotation Network” (TOPSAN; https://www.topsan.org), a radically novel way to collect, share and distribute information about protein three-dimensional structures, and to advance it towards knowledge about the functions and roles of these proteins in their respective organisms. TOPSAN will serve as a portal for the scientific community to learn about protein structures solved by SG centers, and also to contribute their expertise in annotating protein function. The premise of the TOPSAN project is that, no matter how much any individual knows about a particular protein, there are other members of the scientific community who know more about certain aspects of the same protein, and that the collective analyses from experts will be far more informative than anything a local group, let alone an individual, could contribute. We believe that, if the members of the biological community are given the opportunity, authorship incentives, and an easy way to contribute their knowledge to the structure annotation, they will do so. Therefore, borrowing elements from successful, distributed, collaborative projects, such as Wikipedia (the free encyclopedia anyone can edit) [3] and open source software development projects, TOPSAN will be a broad, collaborative effort to annotate protein structures, initially those determined at the JCSG. We believe that the annotation of proteins solved by structural genomics consortia offers a unique opportunity to challenge the extant paradigm of how biological data is collected and distributed, and to connect structural genomics and structural biology to the entire biological research community. TOPSAN is designed to be scalable, modular and extensible. Furthermore, it is intended to be immediately useful in a simple way and to accommodate incremental improvements to functionality as usage becomes more sophisticated. Our annotation pages will offer the end user a combination of automatically generated and expert-curated annotations of protein structures. We will use available technology to increase the speed and granularity of the exchange of scientific ideas, and use incentive mechanisms that will encourage collaborative participation [4].

    Each of the individual PSI centers currently publishes brief, peer-reviewed scientific documents (Structure Notes) that describe and elucidate, as far as possible, the biologically relevant features of protein three-dimensional structures. These cover only a small fraction of the structures the centers solve, while the others remain noted only as “uncharacterized hypothetical proteins, to be published” in the PDB. Our goal is to reach out to the general biological community to participate in structure/function annotations of these proteins, with volunteers from the community providing expertise, oversight, validation and management of annotations. The resulting structural biology knowledge repository will be a radical experiment in new ways of collecting, sharing and distributing research information, and will explore ways to modify the traditional, rigid and controlled structure of a research project to accommodate the challenges and possibilities brought about by the new, technology-driven, high-throughput science and web-based computer technologies. Rapid advancements in science during the last century have been possible primarily due to the timely communication and sharing of scientific results [5]. Traditional methods of disseminating scientific knowledge, i.e., the publication of manuscripts in peer-reviewed scientific journals and presentations at scientific conferences, limit the rate at which scientific information generated at high-throughput centers can be shared and exchanged. In recent years, the internet has proven to be a valuable medium by which information can be exchanged at a rapid pace, and we believe that the TOPSAN project will significantly influence the process of scientific communication.


    [1] Chandonia JM and Brenner SE. (2006). The impact of structural genomics: expectations and outcomes. Science 311:347-351.
    [2] Sadreyev RI and Grishin NV. (2006). Exploring dynamics of protein structure determination and homology-based prediction to estimate the number of superfamilies and folds. BMC Struct Biol. 6:6.
    [3] Wikipedia – The free encyclopedia anyone can edit (http://en.wikipedia.org/wiki/Wikipedia)
    [4] Robert Axelrod (1985). The Evolution of Cooperation. Basic Books
    [5] Elizabeth L. Eisenstein (1980). The Printing Press as an Agent of Change. Cambridge University Press




    Attached file: TOPSAN poster presented at the 2009 PSI Workshop - Enabling Technologies for Structural Biology (44.79 MB; attached by krishna, 19:12, 6 Apr 2009).
    All content on this site is licensed under a Creative Commons Attribution 3.0 License