Python Toolkit for Accessing RCSB.org Search and Data APIs
06/24
Harness the powerful capabilities of RCSB.org Search and Data API services through the new all-in-one Python toolkit, rcsb-api. This package provides users with streamlined access to the full collection of curated data from the PDB archive together with the rich set of annotations from external resources integrated into RCSB.org, all through a Pythonic interface.
New to APIs? Visit PDB-101 for an introduction.
The Python package can be installed from PyPI or by downloading the source code on GitHub, which will serve as the central hub for all future development, bug fixes, and project discussions.
The package is organized into distinct modules for each API service—a Search API module and a Data API module—which represent the two main APIs that power RCSB.org. The Search API retrieves PDB IDs that match a given query, while the Data API retrieves data for a given set of PDB IDs. Key features of each module include:
Search API module
- Perform all search types available through the RCSB.org Advanced Search query builder (e.g., full-text, attribute-based, sequence and structure similarity, sequence and structure motif, chemical similarity)
- Use simple Boolean logic to intuitively construct complex or nested queries
- Upload custom structure files for structure similarity searches
- Include computed structure models (CSMs) in search results
- Use faceted queries to aggregate results and gain statistical insights
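To make the Boolean-logic feature concrete, here is a minimal sketch, using only the Python standard library, of how operator overloading can compose a nested Search API request. It is not the rcsb-api implementation. The names Terminal, Group, organism, and build_request are hypothetical, and the JSON layout (terminal and group nodes, logical_operator, return_type) is an assumption based on the public Search API request format that should be checked against the current API documentation.

```python
import json


class Node:
    """Base class: lets query nodes be combined with & (AND) and | (OR)."""

    def __and__(self, other):
        return Group("and", [self, other])

    def __or__(self, other):
        return Group("or", [self, other])


class Terminal(Node):
    """A single search condition handled by one search service."""

    def __init__(self, service, **parameters):
        self.service = service
        self.parameters = parameters

    def to_dict(self):
        return {
            "type": "terminal",
            "service": self.service,
            "parameters": self.parameters,
        }


class Group(Node):
    """Several nodes joined by one logical operator."""

    def __init__(self, operator, nodes):
        self.operator = operator
        self.nodes = nodes

    def to_dict(self):
        return {
            "type": "group",
            "logical_operator": self.operator,
            "nodes": [node.to_dict() for node in self.nodes],
        }


def organism(name):
    """Hypothetical helper: exact match on the source organism attribute."""
    return Terminal(
        "text",
        attribute="rcsb_entity_source_organism.scientific_name",
        operator="exact_match",
        value=name,
    )


def build_request(query, return_type="entry"):
    """Wrap a query tree in the top-level request body (assumed layout)."""
    return {"query": query.to_dict(), "return_type": return_type}


# Full-text "hemoglobin" AND (human OR mouse source organism)
hemoglobin = Terminal("full_text", value="hemoglobin")
request = build_request(hemoglobin & (organism("Homo sapiens") | organism("Mus musculus")))
print(json.dumps(request, indent=2))
```

In Python, & binds more tightly than |, so the parentheses around the OR clause are needed to get the intended nesting. The sketch only builds the request body and does not send it.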
Data API module
- Retrieve any subset of metadata, features, and/or annotations for a given list of PDB IDs (e.g., experimental method details, structural annotations, binding sites, etc.)
- Easily fetch data for all structures across the archive
- Construct GraphQL queries using simplified Python syntax
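The idea of building a GraphQL query from Python values can be sketched as follows, using only the standard library. This is an illustration, not how rcsb-api works internally. The function names build_selection, render, and build_entries_query are hypothetical, and the entries(entry_ids: ...) root field and the example field paths are assumptions about the Data API schema that should be verified against its documentation.

```python
import json


def build_selection(paths):
    """Turn dot-separated field paths into a nested dict tree."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.split("."):
            node = node.setdefault(part, {})
    return tree


def render(tree):
    """Render the tree as a GraphQL selection set (without outer braces)."""
    parts = []
    for name, children in tree.items():
        if children:
            parts.append(f"{name} {{ {render(children)} }}")
        else:
            parts.append(name)
    return " ".join(parts)


def build_entries_query(entry_ids, paths):
    """Build a query string for a list of PDB IDs and requested fields."""
    # A JSON array of strings is also a valid GraphQL list literal.
    ids = json.dumps(list(entry_ids))
    return f"{{ entries(entry_ids: {ids}) {{ {render(build_selection(paths))} }} }}"


query = build_entries_query(
    ["4HHB"],
    ["exptl.method", "rcsb_entry_info.resolution_combined"],
)
print(query)
```

Paths that share a prefix are merged into one nested selection, so requesting several fields under the same parent does not repeat that parent in the query. The sketch stops at producing the query string and makes no network request.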
Extensive documentation with examples is available, along with a recorded webinar and training materials that introduce RCSB PDB APIs, the rcsb-api Python package, and a series of hands-on notebooks demonstrating usage of the toolkit.
The rcsb-api package, which builds upon previously described work, was developed by Ivana Truong, B.S. (University of Minnesota) and Habiba Morsy, B.S. (Kean University) under the direction of Dennis Piehl, Ph.D. and Brinda Vallat, Ph.D. (RCSB PDB) as part of the Rutgers RISE (Research Intensive Summer Experience) program, and is described in the Journal of Molecular Biology:
rcsb-api: Python Toolkit for Streamlining Access to RCSB Protein Data Bank APIs
Dennis W. Piehl, Brinda Vallat, Ivana Truong, Habiba Morsy, Rusham Bhatt, Santiago Blaumann, Pratyoy Biswas, Yana Rose, Sebastian Bittrich, Jose M. Duarte, Joan Segura, Chunxiao Bi, Douglas Myers-Turnbull, Brian P. Hudson, Christine Zardecki, Stephen K. Burley
(2025) Journal of Molecular Biology 437: 168970 doi: 10.1016/j.jmb.2025.168970
