Beware, this post contains statistics.

So, for the past few weeks, after getting a 63% on my stats midterm, I've been a bit of a downer on stats. I mean, can you blame me? But I've decided maybe I'm catching on to things after all! For my deviance class, I'm doing my term paper on comparing high school students' use of alcohol and marijuana across the US, Denmark, Finland, France, Germany, Turkey, and the UK. I decided to be brave and include statistical analysis, not using an easy, pansy-pants, point-and-click statistics program, but the hardcore program we use in our stats class. After way too many hours, a few tears, some help from Chad, and lots of Skittles to boost my brain power, I came up with something cool! And I'll show you using my nifty graphs. =) Lol, humor me please.

Okay, look at this graph:

So, we're looking at the total percent of students who said they've consumed 3 or more alcoholic beverages in their lives. I didn't include the kids who said 1-2 because I don't think having one drink means you drink. The European kids are ages 15-16, and the Americans are 17-18, so they're all underage. Turkey's definitely the lowest, but the US is pretty low too. Now, let's pull out the fancies.

I ran a t-test, and it spat this out:

One Sample t-test

data: res

t = 12.2813, df = 6, p-value = 1.776e-05

alternative hypothesis: true mean is not equal to 0

95 percent confidence interval:

67.0581 100.4276

sample estimates:

mean of x

83.74286

Basically, the 95% confidence interval runs from about 67 to 100, and I'm counting anything outside those values as significant. Turkey's at 46%, so it's significantly lower, which is good; that's what we want. But the US doesn't make the cut-off with its 77%. Well, then, let's just throw out Turkey to see if it's messing up the curve. Besides, Turkey has a whole different culture than all the others.
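For anyone who wants to poke at the numbers, here's a rough Python sketch that works only from the summary values printed in the output (mean, t, df, and the interval), since the raw country percentages aren't listed here. It backs out the standard error from the fact that the test compares the mean to 0, and checks which of the two quoted percentages falls outside the interval. The variable names and the `outside_interval` helper are my own inventions, not anything from the stats program.

```python
# Numbers copied from the t-test output (all seven countries).
mean = 83.74286
t_stat = 12.2813
df = 6
ci_low, ci_high = 67.0581, 100.4276

# The test compares the mean to 0, so t = mean / standard error.
std_err = mean / t_stat

# The interval is mean +/- (critical t) * standard error, so the
# critical value can be backed out from the reported half-width.
half_width = (ci_high - ci_low) / 2
t_crit = half_width / std_err


def outside_interval(value, low=ci_low, high=ci_high):
    """Return True if a percentage falls outside the reported interval."""
    return value < low or value > high


print(f"standard error: {std_err:.2f}")
print(f"implied critical t (df={df}): {t_crit:.3f}")
print("Turkey (46) outside interval:", outside_interval(46))
print("US (77) outside interval:", outside_interval(77))
```

The implied critical value should land close to the textbook two-sided 95% value for 6 degrees of freedom, which is a handy sanity check that the output was copied correctly.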

When we ditch Turkey we get this:

One Sample t-test

data: res

t = 28.9162, df = 5, p-value = 9.27e-07

alternative hypothesis: true mean is not equal to 0

95 percent confidence interval:

82.02959 98.03707

sample estimates:

mean of x

90.03333

Yay! The confidence interval is 82-98, so the US at 77 is significantly lower than the others! That means it's low enough that it's actually worth investigating and explaining. The difference isn't just because of a goof in which students they sampled; there's some major difference in the countries and their cultures.
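A quick arithmetic cross-check, sketched in Python: the two alcohol means should differ by exactly the country that was dropped. This assumes the sample size is df + 1 in each run (7 countries, then 6), and the variable names are made up for illustration.

```python
# Means and sample sizes from the two alcohol outputs
# (assumption: n = df + 1 for a one-sample t-test).
mean_all, n_all = 83.74286, 7        # with Turkey
mean_no_tr, n_no_tr = 90.03333, 6    # without Turkey

# Mean * n gives the sum of the percentages in each run;
# the gap between the two sums is the value that was dropped.
dropped_value = mean_all * n_all - mean_no_tr * n_no_tr

# Standard error in each run (t = mean / standard error).
se_all = mean_all / 12.2813
se_no_tr = mean_no_tr / 28.9162

# The US figure compared with the lower bound of the new interval.
us = 77
us_below_interval = us < 82.02959

print(f"value dropped between runs: {dropped_value:.1f}")
print(f"standard error with Turkey: {se_all:.2f}")
print(f"standard error without Turkey: {se_no_tr:.2f}")
print("US below the new interval:", us_below_interval)
```

The recovered value lines up with Turkey's 46%, and the smaller standard error in the second run is why the interval tightens enough for 77 to fall below it.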

Alright, just one quick look at marijuana use because I found some surprising things!

Yikes, the US is way up there on the marijuana scale! We'll skip the analysis with Turkey, and just go to the good stuff:

One Sample t-test

data: res

t = 5.9831, df = 5, p-value = 0.001870

alternative hypothesis: true mean is not equal to 0

95 percent confidence interval:

17.89975 44.86691

sample estimates:

mean of x

31.38333

The US, with 48% of its students having smoked marijuana, is significantly higher than the other countries. That means the percentage is so much higher than the others that it's not due to sampling error, random chance, or a fluke. There's something going on here; that's what the crazy t-test means. The US is also significantly higher than the other countries for having smoked marijuana in the last 12 months AND the last 30 days. Weird!!
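To put a rough size on that gap, here's a small Python sketch using only the summary numbers from the marijuana output. It's purely descriptive and not a replacement for the test: it backs out the spread of the six percentages from the standard error (assuming n = df + 1 = 6, with the US itself as one of the six) and measures how far 48 sits above the interval and above the mean. All names here are my own.

```python
import math

# Numbers copied from the marijuana t-test output (Turkey excluded).
mean = 31.38333
t_stat = 5.9831
n = 6                # assumption: n = df + 1
ci_high = 44.86691
us = 48

# t = mean / standard error, and standard error = sd / sqrt(n),
# so the sample standard deviation can be backed out of the summary.
std_err = mean / t_stat
sample_sd = std_err * math.sqrt(n)

# How far the US sits above the interval, and above the mean
# measured in sample standard deviations.
gap_above_interval = us - ci_high
sd_units = (us - mean) / sample_sd

print(f"implied sample sd: {sample_sd:.2f}")
print(f"US above upper bound by: {gap_above_interval:.2f} points")
print(f"US above mean by: {sd_units:.2f} sd")
```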

So that's my project, investigating these differences. Plus, I'm trying to get prepped for my stats final at the same time. Fingers crossed that both aren't complete failures, lol.

While Facebook stalking, I found this link to your blog, saw pretty graphs, read the article and was like "woah! that's awesome!", and thus had to leave a comment.

Yay statistics!