Data curation and management have different characteristics depending on the nature of the data and the nature of the institution involved.
Three areas which may show significant differences are government, research, and the corporate environment.
Identify two job descriptions, listing required roles and responsibilities, within each of these three areas (six job descriptions in total), and then compare and contrast their approaches.
1. Data curation and management in the government sector
i. NASA's Climate Data Initiative (CDI) and NASA's Earth Science Distributed Active Archive Centers (DAACs)
NASA's CDI project is a systematic effort to manually curate and share openly available climate data from various federal agencies. The curation steps NASA uses are searching, selecting, and synthesizing Earth science data, metadata, and information from across disciplines and repositories into a single, cohesive, and useful compendium.
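The search, select, and synthesize steps described above can be illustrated with a minimal sketch. It is not NASA's actual tooling: the `DatasetRecord` type, the three function names, and the idea of an "approved repositories" list are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata record for one dataset in one repository."""
    title: str
    repository: str
    keywords: tuple


def search(records, term):
    """Step 1: find records whose title or keywords mention the term."""
    term = term.lower()
    return [
        r for r in records
        if term in r.title.lower() or any(term in k.lower() for k in r.keywords)
    ]


def select(records, approved_repositories):
    """Step 2: keep only records from repositories the curator has vetted."""
    return [r for r in records if r.repository in approved_repositories]


def synthesize(records):
    """Step 3: merge records with the same title into one compendium entry."""
    compendium = {}
    for r in records:
        key = r.title.strip().lower()
        entry = compendium.setdefault(
            key, {"title": r.title, "repositories": [], "keywords": set()}
        )
        if r.repository not in entry["repositories"]:
            entry["repositories"].append(r.repository)
        entry["keywords"].update(r.keywords)
    return compendium
```

In practice the selection step is a human judgment, which is why the source calls the curation manual; the filter here only stands in for that decision.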
ii. Chief data officer in a government department of biological sciences
The chief data officer maintains databases of large biological molecules such as proteins and nucleic acids. In some government agencies, the data is submitted by chemists and biologists from around the world. Most major scientific journals, and some funding agencies, now require scientists to submit their structure data to the PDB (Protein Data Bank). Many other databases use protein structures deposited in the PDB.
2. Data curation and management in research
i. Research data curation specialist for cancer
The Research Data Curation Specialist will support the clinical research program in the areas of data collection, computing, and database organization. Duties include the examination, synthesis, and evaluation of medical records; the abstraction and recording of pertinent medical information; and the monitoring of patient status. The Research Data Curation Specialist will be responsible for the collection, management, and quality assurance review of patient clinical data.
ii. Data research and acceleration analyst
The analyst works with a team of quantitative researchers, data scientists, engineers, and business colleagues to translate data into actionable insights and to accelerate the research pipeline. Various responsibilities are -
3. Data curation and management in the corporate environment
i. Data Engineer
The data engineer should have expertise in data management, data load acquisition and transformation, applying data quality business rules, and understanding business requirements and complex strategies in order to implement data warehouse and campaign list processes for customers. The engineer should develop specific processes with the data curation team to support data curation for different use cases and required views of the data, including model scoring, validation, reporting, and list production, to support the implementation and delivery of any type of analytic solution.
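As an illustration of what "applying data quality business rules" might look like during a data load, here is a minimal sketch. The rule names, the `email` and `age` fields, and the `apply_quality_rules` function are hypothetical; a real job posting would imply rules specific to the employer's data.

```python
def apply_quality_rules(rows, rules):
    """Split rows into accepted rows and (row, failed rule names) pairs."""
    accepted, rejected = [], []
    for row in rows:
        # Collect the name of every rule this row violates.
        failures = [name for name, check in rules.items() if not check(row)]
        if failures:
            rejected.append((row, failures))
        else:
            accepted.append(row)
    return accepted, rejected


# Hypothetical business rules for a campaign list; each returns True on pass.
RULES = {
    "email_present": lambda row: bool(row.get("email")),
    "age_in_range": lambda row: isinstance(row.get("age"), int)
    and 18 <= row["age"] <= 120,
}
```

Keeping the failed rule names next to each rejected row lets the curation team report why a record was excluded, rather than dropping it silently.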
ii. Data modeler architecture
This position is that of a data modeler analyst. The person in this role not only works on data models but also drives data strategy and governance for new and upcoming projects, reviews curation and distribution strategy, and helps implement data structures in a multitude of environments, including but not limited to Teradata, Hadoop, and, in the near future, the cloud.
Comparing various approaches in data curation and management
In the government sector, data curation helps in planning and implementing various policies. Researchers identify several issues in data curation (storage capacity, data sharing, data description and organization, confidentiality, intellectual property rights, and complexity) and develop new approaches to managing data. In the corporate sector, by contrast, data curation can be seen as a strategy for maintaining data and delivering the best projects for clients all across the world.