Data Catalog


Search Tips

  • Default Boolean Operator: Terms in a search string are connected by the AND operator. EXAMPLE: searching genomic melanoma produces the same results as genomic AND melanoma.
  • AND, OR, NOT: Combine terms in a search with the boolean operators AND (to add specificity), OR (to broaden), or NOT (to eliminate a term). Operators must be capitalized (all other search terms are not case-sensitive).
  • Wild Card: Add the wild card symbols * (open-ended) and ? (single character) to truncate terms. EXAMPLE: searching gen* shows results for 'gene', 'genetic', 'genomic', etc. Searching gene? returns results for 'gene' or 'genes' but not for 'genome'.
  • Phrase Searching: Enclose a search in quotation marks ("keyword or search term") to search for an exact phrase. EXAMPLE: "transcription factors" returns records where the word transcription is immediately followed by the word factors.
  • Diacritics: Characters with diacritics are matched by their non-diacritic forms. EXAMPLE: massague returns results for Massagué.
  • Filter by: Use the filters in the left panel of the homepage to search for records or to narrow a search. On the search results page, remove a filter by clicking its term at the top of the page.
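
The wild card and diacritics tips can be illustrated with a short Python sketch. This is not the catalog's search engine, whose implementation is not described on this page. It is a minimal model that assumes, based only on the examples in the tips, that * stands for any run of characters, ? stands for at most one character, and matching ignores case and diacritics. The function names fold, wildcard_to_regex, and matches are hypothetical.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Lowercase text and strip diacritics, e.g. 'Massagué' -> 'massague'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a wild card term into a regular expression.

    Assumed semantics (inferred from the tips, not confirmed):
    '*' matches any run of word characters, '?' matches zero or one.
    """
    parts = []
    for ch in fold(term):
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w?")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches(term: str, word: str) -> bool:
    """Return True if the whole word matches the wild card term."""
    return wildcard_to_regex(term).fullmatch(fold(word)) is not None


if __name__ == "__main__":
    for term, word in [
        ("gen*", "genomic"),
        ("gene?", "genes"),
        ("gene?", "genome"),
        ("massague", "Massagué"),
    ]:
        print(f"{term!r} vs {word!r}: {matches(term, word)}")
```

Under these assumptions, gene? matches both 'gene' and 'genes' but not 'genome', which agrees with the example in the Wild Card tip.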

About the Data Catalog

Welcome to the MSK Data Catalog, a searchable and browsable online collection of records that describe the contents of datasets and provide access instructions for those wishing to explore the data for their own research. The Data Catalog is designed to help you identify data resources that may inform your project or research question. Some data resources may be immediately available to you from the catalog. Others may require you to complete a data use application form or contact the dataset owner to explore your options.

The MSK Data Catalog is a digital way-finder designed to enhance discoverability and maximize the usefulness of datasets through rich metadata: description, keywords, data formats, instrumentation or software used to create the data, and access information. It also connects researchers to data authors and experts to promote greater exposure and reuse of data while providing appropriate protections for authors. The Data Catalog is NOT a repository for storing data; it is a portal to datasets stored in internal and external repositories. Our current inclusion criteria for this catalog are:

  • Internal datasets created by MSK researchers and associated with publication(s),
  • Internal datasets created by MSK researchers but not yet published or with no associated publication(s),
  • External datasets referenced and used by MSK researchers in publication(s).

If you are interested in submitting a dataset to the MSK Data Catalog, would like to suggest a dataset for inclusion, or are willing to serve as a local expert, please contact us.

Highlights of the MSK Data Catalog:

  • Aggregate access to MSK datasets from one location,
  • Identify access points, experts, or owners of MSK datasets,
  • Help MSK researchers locate and understand datasets generated at external organizations,
  • Facilitate both internal and external research collaborations,
  • Increase the visibility of research data generated by MSK researchers and support data re-use,
  • Enhance Synapse records with data availability information.

While the cataloged datasets represent MSK data, they do not include all publicly available, licensed, or private external datasets. If you would like assistance in conducting a more comprehensive search of all datasets accessible through the library, please contact us.

The code used to create the MSK Data Catalog is open source and available via GitHub. Documentation and further information are available via OSF. If you would like to create a similar catalog, you can learn more by visiting the Data Discovery Collaboration.

Meet the Team

Anthony Dellureficio

Data Catalog Coordinator

Anthony is the Associate Librarian for Data Management Services at the MSK Library. Drawing on a background in library technology, he works with researchers to provide systems solutions and offer a wide range of data management services at MSK. His academic areas of interest include classical genetics and the history of science, technology, and medicine.

Eric Willoughby

Data Catalog Developer

Eric is the Programmer/Analyst at the MSK Library, where he works with library professionals to develop applications that support the diverse information needs of MSK researchers, staff, and patients.

Klara Pokrzywa

Data Catalog Metadata Librarian

Klara is the Metadata Librarian for Data Management Services at the MSK Library. She works to support data discovery and metadata interoperability, and she creates and updates catalog records for new and existing MSK datasets.

Memorial Sloan Kettering Cancer Center

© 2025 Memorial Sloan Kettering Cancer Center