© 2014 Pacific Crest
5.4 Transforming Data
3. Air Pollution and Mortality
Story Name: Air Pollution and Mortality
Story Topics: Environment
Methods: Outlier, Transformation, Regression
Abstract: Researchers at General Motors collected data on 60 U.S. Standard Metropolitan
Statistical Areas (SMSAs) in a study of whether air pollution contributes to mortality.
The dependent variable for analysis is age-adjusted mortality (called “Mortality”). The
data include variables measuring demographic characteristics of the cities, variables
measuring climate characteristics, and variables recording the pollution potential of three
different air pollutants.
Perform and document each of the following tasks using this data:
a. Clean the data.
b. Group the data by state.
c. Determine the state with the highest and lowest average rainfall.
d. Graphically display information about the relationship between median income and education
by size of city population.
e. Determine the relationship between population density and median income.
f. Challenge: Compare the average median income for the cities in a state and the average
median income of the state in which those cities are located.
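As a starting point for tasks (a) through (c), the following is a minimal sketch, not a worked solution. It assumes each record carries a city label of the form "Name, ST" and an annual rainfall value. The field names `city` and `rain`, the helper names, and every row shown are hypothetical placeholders made up for illustration; they are not values from the actual data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows for illustration only; NOT values from the real data set.
# Assumed layout: "city" is "Name, ST" and "rain" is average annual rainfall.
rows = [
    {"city": "Alpha, OH", "rain": 36.0},
    {"city": "Beta, OH", "rain": 40.0},
    {"city": "Gamma, CA", "rain": 12.0},
    {"city": "Delta, NY", "rain": 42.0},
    {"city": "Epsilon, NY", "rain": None},  # a missing value to be cleaned out
]


def state_of(city):
    """Return the state abbreviation that follows the last comma."""
    return city.rsplit(",", 1)[1].strip()


def average_rain_by_state(records):
    """Drop records with missing rainfall, group by state, and average."""
    groups = defaultdict(list)
    for record in records:
        if record["rain"] is None:
            continue  # cleaning step: skip records with no rainfall value
        groups[state_of(record["city"])].append(record["rain"])
    return {state: mean(values) for state, values in groups.items()}


averages = average_rain_by_state(rows)
wettest = max(averages, key=averages.get)  # state with highest average rainfall
driest = min(averages, key=averages.get)   # state with lowest average rainfall
print(wettest, driest)
```

Skipping records with missing values is only one possible cleaning choice. Whatever rule you adopt for missing or suspicious values, record it as part of documenting task (a).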
Hardest Problem
How hard can it be? Can you still use what you’ve learned?
Based on the Models, the Methodologies, and the Demonstrate Your Understanding (DYU) problems in
this activity, create the hardest problem you can. Start with the hardest DYU problem in this experience
and, by contrasting and comparing it with the other DYU problems, play “What if” with the different
conditions and parameters in the various problems.
Can you still solve the problem? If so, solve it. If not, explain why not.
What are the conditions and parameters that make a problem where you must transform data a difficult
problem to solve?
Troubleshooting
Find the error and correct it!
In analyzing the Minimum Wage data set, what issues about the transformations would you want to
explore and test before you accepted the transformations they produced?
Making it Matter
Solving problems in your life
● Think about your current or last science lab course. What are examples of data derived from
experimental data? What transformations would, could, or should you use, and how would you present
each example data set for use or review by others?