Which activities are involved in data analysis?

1 Answer

The 5 steps involved in the data analysis process:

These five steps are described in the paper ‘Enterprise Data Analysis and Visualization: An Interview Study’ by Sean Kandel and colleagues. For this study, the authors interviewed 35 data analysts from 25 organizations. They describe it as follows:

“To better understand the enterprise analysts’ ecosystem, we conducted semi-structured interviews with 35 data analysts from 25 organizations across a variety of sectors, including healthcare, retail, marketing and finance. Based on our interview data, we characterize the process of industrial data analysis and document how organizational features of an enterprise impact it.”

The 5 data analysis steps mentioned in the paper are as follows:

Discovery

Wrangling

Profiling

Modeling

Reporting

Discovery:

The first step in the data analysis process is to discover and collect the data to be analyzed. Data can be gathered from multiple sources, such as database tables, log files, spreadsheets, or online sources. The main challenges in this phase are finding relevant data and interpreting what certain fields in the database tables mean.
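
As a minimal sketch of the discovery step, the Python snippet below gathers records from three of the source types named above: a database table, a spreadsheet exported as CSV, and a log file. The table name `orders`, the column names, and all sample values are hypothetical, and in-memory stand-ins replace real files so that the sketch runs on its own.

```python
import csv
import io
import sqlite3

# Hypothetical stand-ins for real sources (all names and values are made up).
CSV_TEXT = "customer_id,region\n1,north\n2,south\n"
LOG_TEXT = (
    "2019-02-14 10:00:01 INFO login user=1\n"
    "2019-02-14 10:05:42 WARN timeout user=2\n"
)


def load_from_database():
    """Read rows from a database table (an in-memory SQLite table here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 20.0), (2, 35.5)])
    rows = conn.execute(
        "SELECT customer_id, amount FROM orders ORDER BY customer_id"
    ).fetchall()
    conn.close()
    return rows


def load_from_csv(text):
    """Read a spreadsheet exported as CSV into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def load_from_log(text):
    """Keep the raw, non-empty log lines; parsing them belongs to wrangling."""
    return [line for line in text.splitlines() if line.strip()]


sources = {
    "database": load_from_database(),
    "spreadsheet": load_from_csv(CSV_TEXT),
    "log": load_from_log(LOG_TEXT),
}
for name, records in sources.items():
    print(name, len(records), "records")
```

Note that the CSV reader returns every field as a string, while the database returns typed values. Spotting such mismatches early is part of interpreting the fields of each source.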

Wrangling:

Once the data is collected, the next step is wrangling, or cleaning, the available data. The main tasks in this phase are data manipulation and the integration of data obtained from multiple sources. Issues data analysts face in this phase include processing semi-structured data (e.g., data from log files) and integrating data obtained from diverse sources.
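
The two wrangling tasks can be sketched in Python as follows: parsing semi-structured log lines into records, then integrating them with a second source. The log format, the field names, and the join key are assumptions made for illustration only.

```python
import re

# Hypothetical log format: "<date> <time> <LEVEL> <event> user=<id>"
LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<event>\w+) user=(?P<user>\d+)$"
)


def parse_log_line(line):
    """Turn one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["user"] = int(record["user"])  # convert the id so it can be joined
    return record


def integrate(events, customers):
    """Attach the customer's region to each event, joining on the user id."""
    by_id = {c["customer_id"]: c for c in customers}
    merged = []
    for event in events:
        customer = by_id.get(event["user"], {})
        merged.append({**event, "region": customer.get("region")})
    return merged


log_lines = [
    "2019-02-14 10:00:01 INFO login user=1",
    "corrupted line without structure",
    "2019-02-14 10:05:42 WARN timeout user=2",
]
customers = [{"customer_id": 1, "region": "north"}]

# Lines that do not fit the expected format are dropped here.
events = [r for r in map(parse_log_line, log_lines) if r is not None]
merged = integrate(events, customers)
for row in merged:
    print(row)
```

Events whose user has no match in the customer source keep a `region` of `None` instead of being discarded, so the gap stays visible to the next phase.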

Profiling:

Before using the data in any analysis, we need to check it for problems. Data may have quality issues such as missing, erroneous, or extreme values, which can affect the analysis results. In this phase, the data analyst verifies that the data to be used in the analysis contains no such anomalies.
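
A profiling pass over a single numeric column might look like the Python sketch below. The rules it applies are assumptions, not something taken from the paper: a value is treated as erroneous if it falls below a domain minimum, and as extreme if it lies outside 1.5 times the interquartile range, a common rule of thumb. The sample amounts are made up.

```python
import statistics


def profile(values, valid_min=0.0):
    """Return the positions of missing, erroneous, and extreme entries."""
    missing = [i for i, v in enumerate(values) if v is None]
    present = [(i, v) for i, v in enumerate(values) if v is not None]
    # Assumed rule: anything below the domain minimum is erroneous.
    erroneous = [i for i, v in present if v < valid_min]
    valid = [(i, v) for i, v in present if v >= valid_min]
    # Assumed rule: flag values outside 1.5 * IQR around the quartiles.
    q1, _, q3 = statistics.quantiles([v for _, v in valid], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    extreme = [i for i, v in valid if not (low <= v <= high)]
    return {"missing": missing, "erroneous": erroneous, "extreme": extreme}


# Hypothetical order amounts with one gap, one negative value, one outlier.
amounts = [20.0, 22.5, None, 19.0, -5.0, 21.0, 23.0, 500.0, 18.5, 20.5]
result = profile(amounts)
print(result)
```

The function only reports positions. Whether to drop, correct, or keep a flagged value is a decision for the analyst.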

Modeling:

In this phase, the data analyst decides which features, scale, and statistical method to use for the analysis. Issues faced during this phase include selecting relevant features and scaling, since analysis tools may not cope with the size of the data.
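
To illustrate feature selection and the choice of a statistical method, the Python sketch below ranks candidate features by their absolute Pearson correlation with the target and fits a least-squares line to the strongest one. This is only one simple approach among many; the feature names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Hypothetical candidate features and target values.
features = {
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "store_id": [3.0, 1.0, 4.0, 1.0, 5.0],
}
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

# Feature selection: keep the feature most strongly correlated with the target.
best = max(features, key=lambda name: abs(pearson(features[name], sales)))
slope, intercept = fit_line(features[best], sales)
print(best, round(slope, 3), round(intercept, 3))
```

Correlation ranking considers one feature at a time and captures only linear relationships, so it is a starting point rather than a full selection method.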

Reporting:

In this final step, the insights gained from the analysis are reported. Two points need to be considered in this phase: communicating the assumptions behind the analysis effectively, and the limitations of static reports (i.e., reports that offer no interactive way to check the results).
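
One way to keep assumptions from getting lost is to make them a required part of the report itself. The Python sketch below builds a plain-text report that lists findings and assumptions side by side; the function name `build_report`, the report layout, and the sample findings are all hypothetical.

```python
def build_report(title, findings, assumptions):
    """Render findings and the assumptions behind them as plain text."""
    lines = [title, "=" * len(title), "", "Findings:"]
    lines += [f"  - {name}: {value}" for name, value in findings.items()]
    lines += ["", "Assumptions:"]
    lines += [f"  {i}. {text}" for i, text in enumerate(assumptions, start=1)]
    return "\n".join(lines)


# Hypothetical results carried over from a modeling step.
report = build_report(
    title="Sales analysis",
    findings={"selected feature": "ad_spend", "slope": 2.0, "intercept": 1.0},
    assumptions=[
        "Rows with missing amounts were dropped before modeling.",
        "The relationship between ad_spend and sales is linear.",
    ],
)
print(report)
```

The output is still a static report, so readers cannot re-run the analysis from it. Stating the assumptions explicitly at least tells them what would need checking.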

Conclusion:

The data analysis process involves five phases: discovery, wrangling, profiling, modeling, and reporting. Depending on its nature, a particular analysis may skip some of these steps. This post also discussed some of the issues data analysts face during each phase.

asked Feb 14, 2019 in Computer Science by danish (1.0m points)