Introduction to big data midterm exam solution

 
   


  

QUESTION 1

What are the three characteristics of Big Data, and what are the main considerations in processing Big Data?

  

QUESTION 2

Explain the differences between Business Intelligence (BI) and Data Science.

  

QUESTION 3

Briefly describe each of the four classifications of Big Data structure types (i.e., from structured to unstructured).

  

QUESTION 4

List and briefly describe each of the phases in the Data Analytics Lifecycle.

  

QUESTION 5

In which phase would the team expect to invest most of the project time? Why? Where would the team expect to spend the least time?

  

QUESTION 6

Which R command would create a scatterplot from the data frame “df”, assuming df contains columns x and y?

  

QUESTION 7

What is a rug plot used for in a density plot?

  

QUESTION 8

What is a type I error? What is a type II error? Is one always more serious than the other? Why?

  

QUESTION 9

Why do we consider K-means clustering an unsupervised machine learning algorithm?

  

QUESTION 10

Detail the four steps in the K-means clustering algorithm.
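
As a reference point for this question, the sketch below walks through one common four-step formulation of K-means (Lloyd's algorithm): pick initial centroids, assign points to the nearest centroid, recompute centroids, and repeat until the assignments stop changing. The course's own wording of the four steps may differ. The function name `kmeans`, its parameters, and the sample points are illustrative assumptions, not part of the exam.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Minimal K-means sketch on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 1: choose k initial centroids (assumption: k distinct data points at random).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid (squared Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Step 4: repeat steps 2 and 3 until the assignments no longer change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments

# Hypothetical sample data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
```

No class labels are passed to the function at any point, which is the sense in which the method is unsupervised: the grouping comes only from the distances between the points.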

  

QUESTION 11

List three popular use cases of the Association Rules mining algorithms.

  

QUESTION 12

Define Support and Confidence.
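
For orientation, the two measures are usually defined as follows: the support of an itemset X is the fraction of transactions that contain X, and the confidence of a rule X → Y is support(X ∪ Y) / support(X). The sketch below computes both on a small made-up set of transactions. The function names and the basket data are illustrative assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, lhs, rhs):
    """Confidence of the rule lhs -> rhs: support(lhs and rhs together) / support(lhs)."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)

# Hypothetical market-basket data: four transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})       # 2 of 4 transactions -> 0.5
c = confidence(transactions, {"bread"}, {"milk"})  # 0.5 / 0.75, about 0.667
```

Here bread and milk appear together in two of four transactions, and bread appears in three, so the rule bread → milk holds in two of the three transactions that contain bread.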

  

QUESTION 13

How do you use a “hold-out” dataset to evaluate the effectiveness of the rules generated?

  

QUESTION 14

List two use cases of linear regression models.

  

QUESTION 15

Compare and contrast linear and logistic regression methods.