Albert S. Cook Library


Sources for finding data

Getting Started

Welcome to the Research Guide of Data! 

We are working to provide Research Data Services that cover every phase of the Data Lifecycle (Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, and Analyze), for both research projects and courses.

The phases of the Data Lifecycle are described in the Research Data Services section, which also provides some resources on Data Analysis/Visualization and Data Repository/Sharing. Dataset resources remain in the Data Repository/Sharing section.


If you are looking for resources on Data Analysis/Visualization, you may want to take a look here.

To learn more about the research data services provided by the Cook Library, please contact Songyao Chen, the Data Science Librarian.

Below are quick links to the sections you may need (you can also find all sections in the left bar):

Data Management Plan: Funder Requirements, etc.

Data Repository/Datasets: ScholarWorks, Datasets of different disciplines, Dataverse, ICPSR, OSF, Kaggle, ORCID, etc.

Data Visualization/Analysis: Data Analysis phases, Data Visualization/Analysis Tools, Mapping tools, etc.



Updates on DMPTool Support for the NIH DMSP Requirements 

The Winter 2023 Newsletter, released yesterday, announces the latest updates on support for the new NIH DMSP requirements.

***Today, we removed the older NIH DMSP templates, and all newly created plans will now be directed to the new NIH template. Previously created plans are still accessible; this only affects new plans.***

The DMPTool NIH Template Working Group is focused on supporting the upcoming NIH requirements for data sharing. This hard-working group, chaired by Nina Exner of Virginia Commonwealth University, has developed several new resources for the community.

Learn more here.

NIH new policy notice:

On October 29, 2020, the National Institutes of Health (NIH) issued its final NIH Policy for Data Management and Sharing (DMS Policy) to promote the management and sharing of scientific data generated from NIH-funded or conducted research. This Policy establishes the requirements of submission of Data Management and Sharing Plans (hereinafter Plans) and compliance with NIH Institute, Center, or Office (ICO)-approved Plans. It also emphasizes the importance of good data management practices and establishes the expectation for maximizing the appropriate sharing of scientific data generated from NIH-funded or conducted research, with justified limitations or exceptions. This Policy applies to research funded or conducted by NIH that results in the generation of scientific data.

The official notice can be found here. The new policy takes effect on January 25, 2023.

Research Data Management

Research Data Management (RDM) is a broad concept that includes processes undertaken to create organized, documented, accessible, and reusable quality research data. The role of the librarian is to support researchers through the research data lifecycle. 

The processes involved in RDM are more complex than simply backing up data on a thumb drive and ensuring that sensitive data is kept secure. Managing data includes using file naming conventions, organizing files, creating metadata, controlling access to data, backing up data, citing data, and more. There are checklists online which point to the considerations and processes in RDM.


"Keeping Up With… Research Data Management", American Library Association, April 17, 2018. (Accessed August 12, 2022)

Document ID: 0c271f0c-e0ba-4306-8278-cdcff353a83c
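
As an illustration of the file naming conventions mentioned above, here is a minimal Python sketch. The pattern it uses (project, description, date, and version joined by underscores) is only an assumption chosen for this example, not a Cook Library or funder requirement, and the function name build_filename is hypothetical. Adapt the pattern to whatever convention your project agrees on.

```python
import re
from datetime import date


def build_filename(project, description, when, version, ext):
    """Compose a name like 'project_description_YYYYMMDD_v02.ext'.

    The element order and separators are assumptions for illustration.
    """
    def slug(text):
        # Lowercase, then replace runs of other characters with one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{slug(project)}_{slug(description)}_"
        f"{when:%Y%m%d}_v{version:02d}.{ext.lstrip('.')}"
    )


# Hypothetical project and file description, used only as sample input.
example = build_filename(
    "Stream Survey", "water quality raw", date(2023, 1, 25), 2, "csv"
)
print(example)
```

Writing the date as YYYYMMDD and zero-padding the version number keeps files in chronological and version order when they are sorted by name.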


Why is data management important, and what could happen when no data management has been applied in your research project? The video below will tell you!

Data Citation

Citing Data

It's important to cite data correctly. Proper citation ensures that research data can be:

  • discovered
  • reused
  • replicated for verification
  • credited for recognition
  • tracked to measure usage and impact

How to Cite Data

Citing data is straightforward. Each citation must include the basic elements that allow a unique dataset to be identified over time:

  • Author
  • Title
  • Distributor
  • Date
  • Version
  • Persistent identifier (such as a Digital Object Identifier (DOI), Uniform Resource Name (URN), or Handle)
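
To show how the elements listed above fit together, here is a minimal Python sketch that assembles them into one citation string. The ordering and punctuation are assumptions for illustration, not an official citation style, so follow the style guide your publisher or discipline requires. The class name DatasetCitation and every value in the example (author, title, repository, and DOI) are hypothetical placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetCitation:
    author: str
    title: str
    distributor: str
    date: str
    version: str
    persistent_id: str

    def format(self) -> str:
        # Refuse to build a citation if any basic element is missing.
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError("Missing citation elements: " + ", ".join(missing))
        # Assumed layout: Author (Date). Title (Version). Distributor. Identifier
        return (
            f"{self.author} ({self.date}). {self.title} "
            f"(Version {self.version}). {self.distributor}. {self.persistent_id}"
        )


# Placeholder values only; this is not a real dataset or DOI.
citation = DatasetCitation(
    author="Doe, J.",
    title="Example Stream Survey Data",
    distributor="Example Repository",
    date="2023",
    version="1.0",
    persistent_id="https://doi.org/10.0000/example",
)
print(citation.format())
```

Checking for empty elements before formatting reflects the point that each citation needs all of the basic elements to identify a unique dataset over time.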