LeydenStats: 2011

Monday, December 12, 2011

1 in a trillion, I think not.

Check out this article on what could incorrectly be considered Bernoulli trials.

http://www.bbc.co.uk/news/magazine-16118149

Independence matters!

Friday, December 9, 2011

Ch 17 - What you should know.

Ch 17 Summary

You should know what a Bernoulli trial is. When doing a problem, check for these conditions (that means writing out how you know the situation meets them):

There are two possible outcomes
The probability of success is constant
The trials are independent (you can use the 10% condition if they are not truly independent)

You should know when to use a Geometric Probability model, geom(p).

For a random variable that counts the number of Bernoulli Trials until the first success

You should know when to use a Binomial model, binom(n,p)

For a random variable that counts the number of successes in a fixed number of Bernoulli Trials.

You should know about the Success/Failure Condition

For a Normal model to be a good approximation of a Binomial model, we must expect at least 10 successes and at least 10 failures: np >= 10 and nq >= 10, where q = 1 - p.
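As a quick illustration, the Success/Failure Condition can be checked in a few lines of Python. This is only a sketch; the function name `success_failure_ok` and the sample sizes in the example are made up for illustration.

```python
def success_failure_ok(n, p, minimum=10):
    """Return True when we expect at least `minimum` successes and failures."""
    q = 1 - p  # probability of failure
    return n * p >= minimum and n * q >= minimum

# With p = 0.13, a sample of 5 is far too small for a Normal approximation,
# while a sample of 100 gives np = 13 and nq = 87, so the condition is met.
small_sample_ok = success_failure_ok(5, 0.13)    # False
large_sample_ok = success_failure_ok(100, 0.13)  # True
```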

Calculator skills

geometpdf(p,x) returns the probability that the first success occurs on trial x
geometcdf(p,x) returns the cumulative probability from 1 to x. Use it directly for "at most" problems; for "at least x", compute 1 - geometcdf(p,x-1).
binompdf(n,p,X) returns the probability of exactly X successes
binomcdf(n,p,X) returns the cumulative probability from 0 to X. Use it directly for "at most" problems; for "at least X", compute 1 - binomcdf(n,p,X-1).
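If you don't have your calculator handy, here is a rough Python sketch of what these four functions compute. The Python function names simply mirror the calculator names; they are not part of any library. The example at the bottom borrows n = 5 and p = 0.13 from the left-handed practice questions on this blog.

```python
from math import comb

def geometpdf(p, x):
    """P(first success occurs on trial x), for x = 1, 2, 3, ..."""
    return (1 - p) ** (x - 1) * p

def geometcdf(p, x):
    """P(first success occurs on or before trial x)."""
    return 1 - (1 - p) ** x

def binompdf(n, p, x):
    """P(exactly x successes in n trials)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomcdf(n, p, x):
    """P(at most x successes in n trials): add up pdf values from 0 to x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

# "At most 2" uses the cdf directly; "at least 3" uses the complement.
p_at_most_2 = binomcdf(5, 0.13, 2)
p_at_least_3 = 1 - binomcdf(5, 0.13, 2)
```

Notice that "at least 3" subtracts the cdf at 2, not at 3, because the cdf already includes the value you pass in.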

Tuesday, December 6, 2011

Here are some more problems to test your knowledge of Bernoulli trials, geometric and binomial probability models.

Ch 17 practice problems

Friday, December 2, 2011

Chapter 17 Guiding Questions

1) What is a Bernoulli Trial? What are the requirements for Bernoulli Trials?

2) What is a geometric probability model for Bernoulli trials? Explain and give the formulas for mean and standard deviation.

3) What is a binomial probability model for Bernoulli trials? Explain and give the formulas for mean and standard deviation.

4) What is the success/failure condition and when is it used with binomial models?

For the following questions: Tell me if this is a geometric or binomial model. Then solve the problem.

5) Assume that 13% of people are left-handed. If we select 5 people at random, what is the probability that the first lefty is the fifth person chosen?

6) Assume that 13% of people are left-handed. If we select 5 people at random, what is the probability that there are some lefties among the 5 people?

7) Assume that 13% of people are left-handed. If we select 5 people at random, what is the probability that the first lefty is the second or third person?

8) Assume that 13% of people are left-handed. If we select 5 people at random, what is the probability that there are exactly 3 lefties in the group?

9) Assume that 13% of people are left-handed. If we select 5 people at random, what is the probability that there are at least 3 lefties in the group?

10) Assume that 13% of people are left-handed. If we select 5 people at random, what is the probability that there are no more than 3 lefties in the group?

Monday, November 28, 2011

Chapter 16 Guiding Questions

Consider the following questions:

1) What is a discrete random variable?

2) What is a continuous random variable?

3) How do you make a probability model for a random variable?

4) What is the expected value of a discrete random variable? How do you calculate the expected value?

5) How do you find the standard deviation of a discrete random variable?

6) What happens to the expected value and variance when...

a) we add or subtract a constant

b) we multiply by a constant

c) we add together two independent random variables

d) we subtract two independent random variables

7) When two independent continuous random variables have Normal models, what is true about the sum or difference of the random variables?

8) Can we add standard deviations together when dealing with random variables?

Wednesday, November 23, 2011

Comments on the General Addition rule

Actually, the general addition rule, P(A or B) = P(A) + P(B) - P(A and B), works for both. When events are disjoint, P(A and B) = 0 (in other words, since they are disjoint, they don't overlap, so the probability of both A and B happening is zero). The same rule works for non-disjoint events, because you have to subtract the overlap; otherwise you are counting that part twice.

For example, let's say there are 50 students taking AP Stats at East and West. Of those 50, let's define event A as being on an academic team and event B as being on an athletic team. These events will likely overlap, since some people just get involved in everything. For this example, let's say 20 of the 50 students are on an academic team, 10 are on an athletic team, and 4 are on both.

To determine the probability of being on an athletic team or an academic team, we need to know the total number of students on athletic teams or academic teams. We could start with 20 + 10, which could also be written as (16 + 4) + (6 + 4): the 16 are students only on an academic team, the 6 are students only on an athletic team, and the 4 are on both. But the problem is that the 4 are the same 4 in each set of parentheses, and we should not count them twice. So we would do (16 + 4) + (6 + 4) - 4, or 20 + 10 - 4, to get 26 students in athletics or academics. The probability would be 26/50 = .52. This was done with counts, but the same could have been done with proportions: .4 + .2 - .08 = .52.
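Here is the same arithmetic as a short Python sketch, using the counts from the example (50 students, 20 academic, 10 athletic, 4 on both). The variable names are just for illustration; exact fractions are used so that rounding doesn't muddy the comparison between the count version and the proportion version.

```python
from fractions import Fraction

total = 50      # students taking AP Stats
academic = 20   # event A: on an academic team
athletic = 10   # event B: on an athletic team
both = 4        # on both kinds of team (the overlap)

# Counting version: subtract the overlap so those 4 are not counted twice.
either_count = academic + athletic - both   # 26 students
p_either = Fraction(either_count, total)    # 26/50

# Probability version: P(A or B) = P(A) + P(B) - P(A and B)
p_from_rule = (Fraction(academic, total)
               + Fraction(athletic, total)
               - Fraction(both, total))

# If A and B were disjoint, `both` would be 0 and the rule would reduce
# to simply adding the two probabilities.
```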

Sunday, November 20, 2011

Chapter 15 Guiding Questions

1) What is the general addition rule for probability? Does this rule work when events are disjoint AND when events are not disjoint?

2) What is conditional probability? What is the formula for conditional probability?

3) What is a contingency table and how can it help us when dealing with probability?

4) How can we mathematically check to see if two events are independent?

5) What is a tree diagram and why do we use these diagrams when dealing with probability?

Thursday, November 17, 2011

Ch 14 Day 3

Wednesday, November 16, 2011

Ch 14 Day 2

Chapter 14 Guiding Questions

1) Define the following: Probability, trial, outcome, event, independent

2) What is the "law of large numbers"?

3) Give some general information about probability. Please mention things like the complement rule, the addition rule, the multiplication rule, and mutually exclusive.

4) What is the difference between mutually exclusive and independence in probability? Can events be both mutually exclusive and independent?

Sunday, November 13, 2011

Review Answer Submissions.

Tuesday, November 8, 2011

There are lots of reasons why we can't do experiments for every situation. Some factors that people are interested in manipulating or testing can cause harm to the participants, which makes the experiments potentially unethical. If you have taken a psychology or sociology class, you may have heard of the Milgram experiments, which are one famous example.

Some survey ideas from Mr. Rossi

1) Cyberbullying/bullying

2) Media and self image

3) Sleep times

4) Extracurricular commitments vs. academics

5) Social network participation

6) Poverty and education (or anything about common distractions at home.)

These next ones are not repeats, but kids would like support.

5) Teaching and humor: does it help in the classroom?

6) Effect of Hollywood/Disney on attitudes toward romance

7) Attitudes and perspectives of and about Muslims post 9/11

8) Parents adding stress to students by being demanding or strictness/lenience of parents

Sunday, November 6, 2011

Chapter 13 Guiding Questions

Here are a few questions to look at for chapter 13.

1) What is an observational study?

2) What is the difference between a retrospective and prospective observational study?

3) What is an experiment?

4) In an experiment, there are many important things to consider. Please define the following terms: experimental units, factor, response variable, treatments, level, single-blind, double-blind, control group, placebo, matched pair design, statistically significant

5) What are the four principles of experimental design? Describe each principle.

Also please notice the AP Stats Survey Assignment posted in the links section -->

Wednesday, November 2, 2011

Questions About Sampling Methods

Complete the form found here

Units I & II Review Assignment

In addition to the work you are doing in Unit III, it is important to keep reviewing material from the previous units.

Here is a link to a few review problems that are due on 11/10/11

Unit I & II Review

Friday, October 28, 2011

Thursday, October 27, 2011

Chapter 12 Guiding Questions

1) Please define/describe and give an example of each of the following terms:

Simple Random Sample (SRS)

Stratified Sample

Cluster Sample

Systematic Sample

Census

Undercoverage

Nonresponse bias

Response Bias

2) What is a voluntary response sample and what is wrong with this type of sampling?

3) What is convenience sampling and what is wrong with this type of sampling?

4) Why is your sample size important when conducting a survey?

5) Why is randomizing important when conducting a survey?

6) What is the difference between a parameter and a statistic?

Monday, October 24, 2011

Chapter 11

How do you conduct a simulation in statistics?

What do you need to be careful of when running a simulation (What can go wrong)?

How can you run a simulation in your calculator?

For the following examples, please describe how you would run the following simulation, and then actually run the simulation.

Problem 1
You are about to take the road test for your driver's license. You hear that only 40% of candidates pass the test the first time. Suppose 20 people go to take their road test in one day. Run a simulation to find the number of people that pass their road test that day.

Problem 2
Recall the hiring discrimination problem from day 1 of class. We ran a simulation using red and white beads. Run two different simulations for this situation. (The details, if you don't have the paper with you: 25 people were up for a certain job, 15 male and 10 female. A lottery was held to choose 8 of the 25, and 6 females and 2 males were chosen. Do we suspect the lottery was rigged?)

Problem 3
Describe an event that can be modeled by a simulation. Plan and run the simulation.
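If you want to try Problems 1 and 2 on a computer instead of a calculator, here is one possible sketch in Python. It is only one way to set these up, not the required method: the function names, the number of repetitions (10,000), and the seed are all arbitrary choices made for illustration.

```python
import random

def simulate_road_test(n_candidates=20, p_pass=0.40, rng=random):
    """Problem 1: return how many of the candidates pass in one simulated day."""
    # Each candidate passes when a random number in [0, 1) falls below 0.40.
    return sum(1 for _ in range(n_candidates) if rng.random() < p_pass)

def simulate_lottery(n_male=15, n_female=10, n_chosen=8, rng=random):
    """Problem 2: draw 8 of 25 without replacement; return the number of females."""
    pool = ["M"] * n_male + ["F"] * n_female
    return rng.sample(pool, n_chosen).count("F")

def estimate_lottery(trials=10_000, threshold=6, seed=1):
    """Estimate how often a fair lottery picks at least `threshold` females."""
    rng = random.Random(seed)  # seeded so the run can be repeated
    hits = sum(1 for _ in range(trials)
               if simulate_lottery(rng=rng) >= threshold)
    return hits / trials

passes_today = simulate_road_test()
proportion_6_or_more = estimate_lottery()
```

One trial of Problem 1 is one day of 20 road tests; one trial of Problem 2 is one fair lottery. The smaller `proportion_6_or_more` turns out to be, the more suspicious 6 females out of 8 looks.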

Thursday, October 20, 2011

Ch 10 Problems Explained

Sorry about the tech issues yesterday with the Google form.

Since I wasn't able to see which problems were the biggest issues, I created these videos to explain the solutions. They were made on the iPad, so they may be a little shaky.

Don't watch them all; just watch the ones you think will help you most. If you are having trouble with logs, you may want to watch the Khan Academy video at the end.

With about 25 minutes to go in class you will take a quiz, so use your time wisely at the beginning of class.

Here is an explanation of the pressure problem... (The direct link should allow full screen if it doesn't work within the blog)

(about 4 minutes) Link to the video: http://youtu.be/eZd5yjZlZ2I

Here are the life expectancy problems (about 7 minutes) Link to the video: http://youtu.be/Z1_WXSIerrE

And the baseball salaries problem (note this one shows the work for logs at the end that were not discussed at the end of the life expectancy problem, about 8 minutes).

Link to the video http://youtu.be/DwaWkd94nGE

And finally, not my video: this one comes from the Khan Academy

Wednesday, October 19, 2011

Chapter 10 Practice Problems

Hello students, today I would like to work on the practice problems found here (or that will be handed out in class)

Submit your answers here. I will look over your results.

The site has a couple of resources that you may find helpful. The StatTrek site has a video tutorial that is pretty nice, and the YouTube video relates specifically to using the TI-84 for transforming data.

Thursday, October 13, 2011

Chapter 10 Guiding Questions

Here are a few questions you need to address:

1) What is the Ladder of Powers? (make a chart and explain what each part of the ladder means)

2) What are the four goals of re-expressing data?

3) What are exponential, logarithmic, and power models and what are their roles in re-expressing data?

To help answer these questions use the Ladder of Powers_Aligators.ftm to explore the ladder of powers.

Wildlife biologists can fairly accurately determine the length of an alligator from aerial photographs or from a boat. Determining the weight of an alligator from a distance is much more difficult. Wildlife biologists in Florida captured 25 alligators in order to collect data and to develop a model from which weight can be predicted from length. The data set alligator.txt (in case Fathom isn't working on your computer) contains the resulting 25 measurements; the first variable is the alligator's weight (in pounds?) and the second is its length (in inches?).

Create a scatterplot of the raw data.
Play around with the scatterplot, swapping out the values for your x and/or y variables with the calculated re-expressed values, such as y-squared, sqrt_y, sqrt_x..., in order to get a scatterplot that is approximately linear.
When you have a scatterplot that is approximately linear, create a least squares regression line (LSRL) by right clicking on the scatterplot and selecting "Least Squares Line".
Using Google docs, explain how you can use this model to predict the weight of an alligator with length 140 inches.

Log into google apps. One team member should create a file "Group # Alligator Problem"
Then in the upper right hand corner of the document, click "share". Add your group members' email addresses and your teacher's email address.
Be sure to include images from Fathom that show the original data, re-expressed data, LSRL, residual plots, and an explanation of how you find the predicted weight of a 140 inch alligator.
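For anyone curious how the re-express, fit, and predict steps look outside of Fathom, here is a rough Python sketch. It assumes, purely for illustration, that taking the log of weight is the rung of the ladder that straightens the plot; your own scatterplot and residual plot should decide which re-expression you actually use. The function names are made up, and no alligator data is included here.

```python
from math import exp, log

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return intercept, slope

def predict_weight(lengths, weights, new_length):
    """Fit ln(weight) against length, then undo the log to get a weight."""
    log_weights = [log(w) for w in weights]       # re-express y
    intercept, slope = least_squares_line(lengths, log_weights)
    predicted_log_weight = intercept + slope * new_length
    return exp(predicted_log_weight)              # back to original units
```

The key step is the last one: the LSRL predicts the re-expressed value (here, the log of weight), so you have to back-transform before reporting a weight. With the real data you would call something like `predict_weight(lengths, weights, 140)`.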

Tuesday, October 11, 2011

Multiple Choice Practice for Test

Sunday, October 2, 2011

Chapter 9 Guiding Questions

Here are a few questions to consider for chapter 9:

What should we always look at before deciding a linear model is a good fit for our data?

What should we do if there looks to be subsets in our data?

What is extrapolation and why is it dangerous?

Describe/define high leverage points

Describe/define outliers.

Describe/define influential points.

Describe a point on a scatterplot that would have a large residual.

What is wrong with working with summary values when comparing two quantitative variables?

Monday, September 26, 2011

Ball Bounce Lab

As you work through the ball bounce lab, there are a couple of topics that will require you to do a little research.

Residuals

What is a residual?
How do you make a residual plot on your TI-83? or look here (if you have headphones).

Writing the equation of the LSRL by hand.

What point does it always pass through?
How can we determine slope?

Interpreting The value of R^2.

What does R-squared mean?

Do your best. Feel free to contact me if you have questions.

Friday, September 23, 2011

Ch. 8 Linear Regression Guiding Questions

When would you use linear regression?

What are residuals?

How do you know if a regression line is a good fit?

How can you get a regression equation? With technology? Without technology?

What assumptions and conditions must be met to proceed with linear regression?

What is a residual plot?

Is a regression line a perfect predictor?

Does a regression line mean that all of the change in the response variable is due to the change in the explanatory variable?

Using the data on the number of beers and blood alcohol level, calculate the LSRL.

Thursday, September 22, 2011

Can you guess the correlation?

Go to the following website to get practice with matching scatterplots to correlation values.

http://istics.net/stat/Correlations/

Wednesday, September 21, 2011

Correlation and Lurking Variables

Please comment on the following questions:

1) What is the difference between correlation and association?

2) If a scatterplot shows a very strong linear relationship, what value(s) should the correlation be close to?

3) If a scatterplot shows a very weak linear relationship, what value(s) should the correlation be close to?

4) Is it possible to have a strong association, but a weak correlation?

5) What is a lurking variable? What kind of effect can it have on the relationship between two quantitative variables?

Tuesday, September 20, 2011

How fast can you write?

You and a partner will collect data on how fast you can write a sentence with your dominant hand, and then again with your non-dominant hand.

Then describe the relationship between writing speeds for each hand. Use your phone, the wall clock, or an online timer, and then record your data in the Google form below.

After the data is collected (we should have data from both East and West campuses), analyze the data using a scatterplot.

Be sure to comment on direction, form, strength, and unusual features. Do you think there is an association? What about correlation?

Here is the sentence you will write. First with your dominant hand, and then again with your non-dominant hand.

“Mos Eisley spaceport. You will never find a more wretched hive of scum and villainy.”

You can view the results of the form here.

Make a copy of the file before you make a scatterplot.

You can learn how to make a scatter plot in google docs here.

Monday, September 19, 2011

Exploring Bi-variate Data Day 2

Some things to think about...9/20

posted by Ms. McCarthy

Here are a few questions I would like you to answer on your team wikipage today

1) Define explanatory variables and response variables.

2) What four things do you need to mention when describing a scatterplot?

3) In context, please describe the following scatterplot:

This is data collected from students in statistics classes including their heights (inches) and weights (pounds).

4) What conditions must be met for correlation and how do you check those conditions?

5) A Statistics teacher is collecting data in an introductory statistics course about the average number of hours students studied each week and their college GPA. Which variable would you use as the explanatory variable and which as the response variable? Why?

6) How can you make a scatterplot using your graphing calculator? Where can you find the correlation?

Sunday, September 18, 2011

Unit II Exploring Bivariate Data

The first Major Topic in AP Statistics is Exploring Data. In Unit I we explored Categorical and Quantitative Data (univariate). In this unit we will look at how two quantitative variables relate.

From the AP Statistics Topic list, here is what we will be focusing on in this unit.

I. Exploring Data: Describing patterns and departures from patterns (20%-30%)

D. Exploring bivariate data

Analyzing patterns in scatterplots (Chapter 7)
Correlation and linearity (Chapter 7)
Least-squares regression line (Chapter 8)
Residual plots, outliers, and influential points (Chapter 9)
Transformations to achieve linearity: logarithmic and power transformations (Chapter 10)

All homework will be through Math XL, assignments have been set up for each unit. Due dates are set.

To get you started consider the following...

In February 1986, 16 students at The Ohio State University participated in an experiment to explore the relationship between blood alcohol content (BAC) and other variables such as amount of alcohol consumed, weight, gender, and age. OSU Police administered a breathalyzer to verify that each student's initial BAC was zero. Each student was then randomly assigned a number of 12 oz beers to consume (1 to 9) by drawing a ticket from a bowl. Thirty minutes after consuming their final beer, students took another BAC test.

How might you analyze this data? Is there a relationship between any of the variables? Don't forget: make a picture, make a picture, make a picture.

What about this data?

Tuesday, September 13, 2011

Test Moved to Friday!

It is official, the test is Friday.

But don't forget to read the post below and complete the lab from class today.

Is the distribution approximately Normal?

You will work on a lab today in class, but part of what you will need to do is justify whether or not the distribution is approximately Normal.

Some things to think about.

How does your data fit the 68-95-99.7 rule?

What is a normal probability plot? How can that help?

Here are your instructions for tonight...

You should have completed your data collection. If you did not, then you will need to proceed with your data set (it should be fine).

Using your data set, complete #2 on the lab worksheet.

To answer #3 see the above questions to guide your work.

For #4, use this Google doc, which was shared with you via email. Enter your data at the bottom of the list in column A. I started it off with two entries (I did the trials in my office, so they are not fake entries). In columns C and D you will see a summary table of the data, which will update in real time as you enter your data. Use this for your histogram for number 4. (Note: not everyone will have the same totals unless we all wait for the last person to enter his/her info. As long as there are at least 140 trials, you may proceed with your histogram. It may be interesting to see the differences. Also, since you and your partner have the same data, please record your name and your partner's name in column B, so that your partner knows he or she doesn't need to enter the data.)

Continue with the rest of the worksheet.

Text me using the number to the right if you have questions.

Mr. Babel

Thursday, September 8, 2011

Comments on the spot check...

The scores of the ACT are approximately Normally distributed with a mean of 18 and a standard deviation of 6. According to Illinois State University's admissions website, the top 25% of their students scored a 27 or higher. What percent of the students that take the ACT scored lower than 27?

The most commonly chosen answer was 75%. This was probably based on the information about ISU, the top 25% scored 27 and higher, so the bottom 75% must have scored lower. But the question is about the percentage of students that take the ACT, not the percentage of students accepted into ISU.

For this we need to know how many standard deviations away from the mean 27 is, in other words the z-score. z =(27-18)/6 = 1.5.

The percent of students that scored below 27 is the same as the percent of a Standard Normal model that falls below a z-score of 1.5.

There are multiple ways we can find this information. One would come from the diagram I saw a couple of groups posted on the wiki.

If you add up all the percentages from the left all the way up to 1.5 (or just do 100 - the percents to the right) you get 93.3%. So about 93% of all scores are less than a 27.

You could also use your calculator, there is a function called normalcdf().

And finally you have a table in the back of the book (something you would also have with you when you take the AP Test).
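If you would like to check this on a computer, Python's standard library has a `NormalDist` class (Python 3.8 and later) whose `cdf` method plays much the same role as normalcdf() with a very low lower bound. This is just a sketch of the calculation above; the variable names are made up.

```python
from statistics import NormalDist

# Normal model for ACT scores from the spot check: mean 18, SD 6.
act = NormalDist(mu=18, sigma=6)

# z-score for a 27: how many standard deviations above the mean it is.
z = (27 - act.mean) / act.stdev          # 1.5

# Proportion of scores below 27, found two equivalent ways.
p_below_27 = act.cdf(27)                 # about 0.933
p_below_z = NormalDist().cdf(z)          # Standard Normal model, same value
```

Both routes give about 93.3%, matching the diagram and the table.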

Again assuming a Normal model with mean 18 and standard deviation 6, what percent of students would score between 24 and 30?

This problem can be done using the graph above or the less complicated 68-95-99.7 rule/picture.

For the ACT, 24 is exactly 1 standard deviation above the mean, and 30 is exactly two standard deviations above the mean. So we can use the 68-95-99.7 rule.

Since 95 percent of the data is between 6 and 30, that means 47.5% is between 18 and 30.

Since 68 percent of the data is between 12 and 24, that means 34% is between 18 and 24.

47.5% - 34% = 13.5%, so about 13.5% of students would score between 24 and 30.
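The 68-95-99.7 rule uses rounded percentages, so it is worth seeing how close it comes to the Normal model itself. Here is a small Python sketch of that comparison, again using `NormalDist` from the standard library; the variable names are only for illustration.

```python
from statistics import NormalDist

act = NormalDist(mu=18, sigma=6)

# Rule of thumb: half of 95% minus half of 68%.
rule_estimate = (0.95 - 0.68) / 2              # 0.135

# Normal model: area below 30 minus area below 24.
model_value = act.cdf(30) - act.cdf(24)        # about 0.136
```

The two answers agree to within about a tenth of a percentage point, which is why the rule is good enough for a quick estimate.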

Wednesday, September 7, 2011

I'm not in class today

A message from Mr. Babel to the most amazing second period class (at East Leyden, in room 139, in the school year 2011-12) in the universe.

You may or may not have noticed, but I'm not in class with you today. I have a meeting that I needed to attend. Thank you to Ms. Kinnane for subbing.

1. Your task for the day. Continue with the Normal model and z-scores. I've posted comments on your work on the wiki, for many groups that will serve as a guide to how to move forward.

First after you and your teams check out the wiki, I want you to take about 5 minutes talking to someone from another team. See what they have learned, share what you have learned.

Then come back to your groups and share what you found out from another group.

2. You have a worksheet of problems, can you answer those yet? If not continue to look for ways to do so.

You may need to google "normalcdf" TI-83 Normal model. This could help with finding specific percentiles.
Or the ti-83 lab that I posted yesterday could be helpful.

3. Here is the full topic list that I posted on Tuesday.

What are z-scores? How are they used?
How does rescaling data affect shape, center and spread?
What is a Normal Model?
What about the Standard Normal Model?
When can we use the Normal Model? Under what conditions?
What does the 68-95-99.7 Rule mean?

How are you doing on this list? Does your wiki address all of these questions?

4. Take a quiz that will be emailed to you in about 15 minutes (not a real quiz, more of a spot check). I just want to see how you are doing; it isn't for a grade.

5. Have a nice day

6. Did you notice the phone number on the left side of the screen? (OK, fine: my left, your right.) I set up this Google Voice number so that you can text me when you have questions. I'm old, I email. You're young, you text. This can help bridge that gap. Your texts will come to my email, and my email response will be texted back to you. If you call that number it will ring my office phone. But keep in mind you will need to identify yourself by name in the text message. You have my permission to text me today during class (make sure Ms. Kinnane reads this so you don't get in trouble). I will try to respond from my meeting (but only if it doesn't cause a distraction, because that would be rude of me).

7. You should be continuing to work on things outside of class, I should be seeing posts on the wiki, or comments on the blog, and as of today texts from you.

8. Check out my blog post from yesterday, some awesome student created videos from some NJ stats students. Amazing work by these students.

Awesome!
Check out these student created videos on statistics, most of them are from a high school in NJ...

http://www.youtube.com/MrAPStatistics

Very Helpful...

Thanks to Joe S. for posting this on his team's page...

http://www.regentsprep.org/Regents/math/algtrig/ATS2/NormalLesson.htm

Additional Resources for learning about the Normal Model

Here is an activity using your TI-83/84

Here are a lot of links related to the normal model

And don't forget to check what your classmates are posting on the wiki.

Tuesday, September 6, 2011

Check out what your classmates put on the wiki

Based on the different conversations I was listening to during class you should take a look at the project wiki and see what other teams put together. I think that if we put it all together, you would have an even better understanding of the Normal Model.

Monday, September 5, 2011

Just how normal are you?

The mean height of men (20 years and over) in the US is approximately 69.4 inches, with a standard deviation of 3.1 inches.

The mean height of women (20 years and over) in the US is approximately 63.8 inches, with a standard deviation of 2.7 inches.

According to the CDC.

But what does that mean? How "interesting" is your height? What would make a height interesting?

Personally, I (Mr. Babel) am at about the 50th percentile for height. How can the above information be used to determine my height?

This chapter is about the normal model and z-scores. You will investigate these topics and be able to answer questions like the ones above.

What are other measurements that we talk about in terms of percentiles? How are these percentiles calculated?

Go to the AP Stats wiki. You have been assigned to a team; study and post what you have learned about the normal model.

Friday, September 2, 2011

Why Statistics Matters

As we approach the tenth anniversary of the 9/11 attacks, reports of a possible association between serving at ground zero and cancer are in the news.

Not recognizing that there is an association means that the 9/11 workers and their families do not receive financial support for treatment of their cancer.

Thursday, September 1, 2011

Looks like we could do a little more updating on our class wiki page.

I've gone through and updated the home page to be more of a navigation page to other areas. I've linked the work some of you have done directly to the text I copied directly from the College Board AP Statistics Topic list.

Take a few minutes to add some important details to pages like shape, center, spread.
Add a page called box plots (or edit it if someone beats you to it).
How about gaps and outliers,
IQR, and how to determine if a data set has outliers.
mean, and standard deviation, median and IQR and how to decide.

Just click the link to the wiki and have at it.

Tuesday, August 30, 2011

Post your interesting data

Use the wiki spaces page Quantitative Data.

Thursday, August 25, 2011

Here is a site to help with Stem and Leaf plots.

Here is a Simpson's Paradox example, Who is the better shooter?

Here is a link to more Simpson's Paradox examples like the one above.

Wednesday, August 24, 2011

Homework 8/25/11

Sunday, August 21, 2011

M&M Lab Results - What do you think? Are Color and Type of Candy Independent?

Thursday, August 18, 2011

Moving...

Go here...

Tuesday, August 16, 2011

Welcome to AP Statistics

Here is the video from the first day. This comes courtesy of @natewright03.

It will be interesting to see if it all looks a little more familiar when the year is done.

Sunday, February 13, 2011

Statistics Videos

Need help with any of the topics we've been covering in class?

Go to this website: http://www.khanacademy.org/ and scroll down to the statistics section. There are some great videos that will help you with confidence intervals, hypothesis testing, and can help you review for the AP exam!