UNDERSTANDING BIG DATA AND DATA ANALYTICS, Lecture notes of Business Fundamentals

Course Subject: Business Analytics Lesson #3: UNDERSTANDING BIG DATA AND DATA ANALYTICS > Business Analytics, BI, Big Data, Data Mining-What’s the Difference? > Businesses Need Support for Decision-Making > Characteristics of Data for Good Decision-Making > What is Big Data? > Types of Data (Ratio Data, Nominal Data, Ordinal Data, etc.) > BIG DATA CHARACTERISTICS

Typology: Lecture notes

2024/2025

Available from 06/06/2025

ughlexisss 🇵🇭

BUSINESS ANALYTICS
Lesson #3
UNDERSTANDING BIG DATA AND DATA ANALYTICS
Business Analytics, BI, Big Data, Data Mining: What’s the Difference?
  • Business Analytics – tools to explore past data to gain insights into future business decisions.
  • BI – tools and techniques to turn data into meaningful information.
  • Big Data – data sets that are so large or complex that traditional data processing applications are inadequate.
  • Data Mining – tools for discovering patterns in large data sets.
Businesses Need Support for Decision-Making
  • Uncertain economics
  • Rapidly changing environments
  • Global competition
  • Demanding customers
  • Taking advantage of information acquired by companies is a critical success factor.
Characteristics of Data for Good Decision-Making
Better Quality Data: Characteristics
ACCURACY – Accurate enough for intended purposes:
  • Balance with use, cost, effort, timeliness
  • Capture close to point of activity
  • Make accuracy compromises clear
VALIDITY – Compliance with requirements:
  • Application of definitions
  • Consistency over time
  • Consistency with others
RELIABILITY – Collection processes consistent:
  • …over time
  • …for multiple collection points
  • …between collection systems
TIMELINESS – To influence decisions:
  • Capture quickly after the event
  • Available quickly enough
  • Available frequently enough
RELEVANCE – Data relevant to intended purpose:
  • Periodic review of requirements
  • Quality assurance and feedback process
  • Use carefully for other purposes
COMPLETENESS – Monitor quality to match data needs:
  • Missing data
  • Invalid data
  • Incomplete data
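
The completeness checks listed above (missing, invalid, incomplete data) can be monitored with a simple count. The following is only a minimal Python sketch under stated assumptions: the `SaleRecord` layout, the `VALID_REGIONS` list and the rule that a negative amount is invalid are all hypothetical and chosen for illustration. How "missing", "invalid" and "incomplete" are told apart here is also an assumption, not a definition given in the notes.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical record layout, invented for illustration only.
@dataclass
class SaleRecord:
    order_id: Optional[str]
    region: Optional[str]
    amount: Optional[float]


# Assumed reference list of allowed regions.
VALID_REGIONS = {"America", "Europe", "Asia"}


def completeness_report(records):
    """Count missing, invalid, incomplete and usable records."""
    report = {"missing": 0, "invalid": 0, "incomplete": 0, "ok": 0}
    for r in records:
        filled = [f is not None for f in (r.order_id, r.region, r.amount)]
        if not any(filled):
            report["missing"] += 1      # nothing was captured at all
        elif not all(filled):
            report["incomplete"] += 1   # some fields captured, some not
        elif r.region not in VALID_REGIONS or r.amount < 0:
            report["invalid"] += 1      # captured, but breaks an assumed rule
        else:
            report["ok"] += 1
    return report


records = [
    SaleRecord("A1", "Europe", 120.0),   # usable
    SaleRecord("A2", "Mars", 80.0),      # region not in the reference list
    SaleRecord("A3", None, 45.5),        # region was not captured
    SaleRecord(None, None, None),        # empty record
]

print(completeness_report(records))
# {'missing': 1, 'invalid': 1, 'incomplete': 1, 'ok': 1}
```

Tracking these counts over time is one way to see whether data quality still matches data needs.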
The Information Gap
The shortfall between gathering information and using it for decision making.
  ➢ Firms have inadequate data warehouses.
  ➢ Business analysts spend 2 days a week gathering and formatting data instead of performing analysis (Data Warehousing Institute).
  ➢ Business Intelligence (BI) seeks to bridge the information gap.
What is Big Data?
  • Massive sets of unstructured/semi-structured data from web traffic, social media, sensors, etc.
  • Petabytes, exabytes of data
  • Volumes too great for typical DBMS
  • Information from multiple internal and external sources:
    • Transactions
    • Social Media
    • Enterprise Content
    • Sensors
    • Mobile Devices
  • In the last minute there were:
    • 204 million emails sent
    • 61,000 hours of music listened to on Pandora
    • 20 million photo views
    • 100,000 tweets
    • 6 million views and 277,000 Facebook logins
    • 2+ million Google searches
    • 3 million uploads on Flickr


Big Data

  • Companies leverage data to adapt products and services to:
    • Meet customer needs
    • Optimize operations
    • Optimize infrastructure
    • Find new sources of revenue
  • Big data can reveal more patterns and anomalies.
  • IBM estimates that by 2015, 4.4 million jobs will be created globally to support big data.
    • 1.9 million of these jobs will be in the United States.

Where does Big Data come from?

Most big data efforts are currently focused on analyzing internal data to extract insights. Fewer organizations are looking at data outside their firewalls, such as social media.

Internal Data Sources
  • Transactions: 88%
  • Log Data: 73%
  • Emails: 57%
External Data Sources
  • Social Media: 43%
  • Audio: 38%
  • Photos and Videos: 34%

Types of Data
  • When collecting or gathering data, we collect data from individual cases on particular variables.
  • A variable is a unit of data collection whose value can vary.
  • Variables can be classified into types according to the level of mathematical scaling that can be carried out on the data.

Levels of Measurement
  1. NOMINAL – named variables
  2. ORDINAL – named + ordered variables
  3. INTERVAL – named + ordered + proportionate interval between variables
  4. RATIO – named + ordered + proportionate interval between variables + can accommodate absolute zero

CATEGORICAL (NOMINAL) DATA
  • Nominal or categorical data is data that comprises categories that cannot be rank ordered; each category is just different.
  • The categories available cannot be placed in any order, and no judgement can be made about the relative size or distance from one category to another.
  • Categories bear no quantitative relationships to one another.

Examples:
  • customer’s location (America, Europe, Asia)
  • employee classification (manager, supervisor, associate)

  • What does this mean? No mathematical operations can be performed on the data relative to each other.
  • Therefore, nominal data reflect qualitative differences rather than quantitative ones.

CATEGORICAL DATA
  • Nominal Categorical Data: group of non-parametric data. Example: random hair color selection.
  • Ordinal Categorical Data: group of ordered non-parametric data. Example: user experience ratings.

NOMINAL DATA
  ➢ Systems for measuring nominal data must ensure that each category is mutually exclusive, and the system of measurement needs to be exhaustive.
  ➢ Exhaustive: the system of categories should have enough categories for all the observations.
  ➢ Variables that have only two responses, i.e. Yes or No, are known as dichotomies.

Nominal data divides variables into mutually exclusive, labeled categories. Examples:
