

Background: As noted by Kirk (2016), working with data is one of the four stages of the visualization workflow. According to Kirk (2016), “A dataset is a collection of data values upon which a visualization is based.” In this course, we will be using datasets that have already been collected for us. Data can be collected by various collection techniques.

Reference: Kirk, A. (2016). Data Visualisation: A Handbook for Data Driven Design (p. 50). SAGE Publications.

Assignment: Summarize 3 data collection techniques (Interviews, Surveys, Observations, Focus Groups, etc.). Compare and contrast the 3 data collection techniques you selected. Lastly, what collection techniques do you prefer and why?

Your research paper should be at least 3 pages (800 words), double-spaced, have at least 4 APA references, and typed in an easy-to-read font in MS Word (other word processors are fine to use but save it in MS Word format).

Please provide the accurate answers for this question.


Expert Solution

Abstract: The choice of a data collection technique depends on the type of research being carried out. Common techniques include interviews, questionnaires and surveys, experiments, documents and records, and focus groups.

Interviews: One of the most sophisticated ways of collecting data for research is to interact with people in person or by telephone. This method lets the researcher view the same event from different perspectives. The data collected might not always be relevant, but it gives the researcher a qualitative measure that can be consolidated during analysis by averaging across the distributed responses. The major concern with this technique is the reliability of the source: every person sees things differently, so each interviewee reports a different observation and a different experience. Nguyen et al. [3] describe one way of collecting such data in “Hire me: Computational Inference of Hirability in Employment Interviews Based on Nonverbal Behavior”. As they summarize, they tried to develop an algorithm that detects nonverbal behavior. The interviewer's task of extracting the important parameters from a face-to-face interview can thus be viewed as a computational problem. Facial features, expressions, clothing, and pauses between sentences while speaking are all important data when a dataset is collected through interviews.

Documents and Records: This technique analyzes recorded data such as a company's database, minutes of meetings, attendance records, and quality assurance sheets. A more advanced case is a set of Python queries based on natural language processing that record tweets, Facebook posts, and YouTube comments by type and analyze the nature of the input. This is a strict and rigorous way of assembling a dataset, and it requires a great deal of legwork and long hours. The resulting dataset is useful in numerous applications, such as intrusion detection systems based on artificial neural networks [1]. The validity of the dataset is very important because machine learning models learn entirely from the dataset they are given, so the quality of their output depends on the input vectors built from it. A dataset collected this way is raw and unlabeled, so it needs to be clustered [2]. For example, choosing a good webpage from the heap becomes extremely important when the number of webpages is unfathomably large.
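As a rough illustration of the "records and analyzes posts" idea above, here is a minimal Python sketch. It is not the method used in any of the cited papers: the records, the field names (`source`, `text`), and the helper functions are all hypothetical, and real data would come from stored documents or exported posts. It only counts how many records came from each source and which words occur most often, which is a typical first look at a raw, unlabeled collection.

```python
from collections import Counter
import re

# Hypothetical sample records (assumption: each record has a source and a text field).
RECORDS = [
    {"source": "tweet", "text": "Login failed again, the server is down"},
    {"source": "youtube_comment", "text": "Great tutorial, the login example helped"},
    {"source": "tweet", "text": "Server maintenance tonight"},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def summarize(records):
    """Return (records per source, word frequencies across all records)."""
    by_source = Counter(record["source"] for record in records)
    words = Counter()
    for record in records:
        words.update(tokenize(record["text"]))
    return by_source, words


by_source, words = summarize(RECORDS)
print(by_source)             # how many records each source contributed
print(words.most_common(3))  # the most frequent words overall
```

Counts like these say nothing about validity on their own, but they make gaps visible early, for instance a source that contributes almost no records.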

Ethnographies: This technique is not restricted to a single method of data collection; it can include interviews, experiments, surveys, and so on. It revolves around the analysis of a single phenomenon and tries to decode that phenomenon and draw observations from the dataset obtained. An example is analyzing rainfall patterns and their effects on rice across different agricultural regions, taking into account the crop yield, the amounts of insecticide and pesticide used, and so on. The researcher can then interview people about their experiences and observations, so the study is not restricted to the researcher's own findings but combines them with the interviewees' findings. Extensive fieldwork is the primary data collection technique in ethnography. The resulting dataset has both qualitative and quantitative parameters to visualize. Ethnography can be used to investigate user experience [4]. For some applications, it is crucial to record and investigate how people live with a particular technology. Ethnography is the best-suited methodology for such research because it provides an in-depth evaluation of the design process and of the design of the technology used.

Conclusion: Although the choice of data collection technique depends on the type of research being conducted, it is beneficial to use the ethnographic methodology to collect the dataset, because it yields a dynamic variety of observations and is not restricted to interviews, experiments, or surveys alone. Moreover, an ethnographic study focuses on a single phenomenon, with a single type of qualitative and quantitative measure, so the data lends itself to clear visualization when it is analyzed in a spreadsheet or plotted in MATLAB. This is possible because ethnography is a holistic approach, that is, a flexible and non-constant system. The approach has the further advantage that observation takes place in the natural environment, with no artificial factors introduced that could muddle the original dataset.


[1] Cao, V. L., Hoang, V. T., & Nguyen, Q. U. (2013). A scheme for building a dataset for intrusion detection systems. 2013 Third World Congress on Information and Communication Technologies (WICT 2013).

[2] Ahmed, M., & Bansal, P. (2013). Clustering Technique on Search Engine Dataset Using Data Mining Tool. 2013 Third International Conference on Advanced Computing and Communication Technologies (ACC).

[3] Nguyen, L. S., Frauendorfer, D., & Mast, M. S. (2014). Hire me: Computational Inference of Hirability in Employment Interviews Based on Nonverbal Behavior. IEEE Transactions on Multimedia, 1018-1031.

[4] Wijaya, S. W., & Nurmalia. (2018). Investigating User Experience with Digital Ethnography Approach: Principles and Guidelines. 2018 10th International Conference on Information Technology and Electrical Engineering (ICITEE).

