(Ch 19, stat 1040)

| Term | Definition |
| ---- | ---- |
| Qualitative | A descriptive value (red, blue, high, low) |
| Quantitative | A numerical value (7, 8, 9) |
| Population | The entire set of existing units that investigators wish to study |
| Sample | A portion or subset of the population |
| Parameter | A number that describes a characteristic of a *population* (*10%* of US senators voted for something) |
| Statistic | A number that describes a characteristic of a *sample* (*71%* of Americans feel that ...) |

> A global consumer survey reported that 6% of US taxpayers used or owned cryptocurrency in 2020. The US government is interested in knowing if this percentage has increased. The University of Chicago surveys 1,004 taxpayers and finds that 13% have used or owned crypto in the past year (2021).

In the above example:
- The *population* was *US taxpayers*
- The *parameter* was *6%*
- The *sample* was *1,004 taxpayers*
- The *statistic* was *13%*

An ideal sample will represent the whole population.

## Sampling

| Sample Type | Description |
| ---- | ---- |
| Simple random | Advantages: the procedure is impartial; the Law of Averages applies. Disadvantages: not always possible; can be very expensive. |
| Quota Sampling | Attempts to get certain proportions based on key characteristics. Quota sampling doesn't guarantee that the selection is an accurate representation. |
| Cluster Sampling | Divide the population into subgroups, randomly select a subgroup, and sample all of the subjects in that group. |
| Convenience | Sampling done close to the researcher because it's easier. |

## Simple Random Samples

## Bias

| Bias Type | Description |
| ---- | ---- |
| Selection | When the procedure that selects the sample is biased |
| Non-Response | Those who don't respond to a survey may have different characteristics than those who do respond |
| Response | When the question is worded in a leading way to elicit a certain response |
| Volunteer response | Self-selecting: individuals volunteer to answer |
| Measurement | The interviewing method influences the response, or the question uses loaded words or ambiguities |

(Ch 20, stat 1040)

The expected value for a sample percentage equals the population percentage. The standard error for that percentage = `(SE_sum/sample_size) * 100%`

The standard error for a percentage is proportional to $\frac{1}{\sqrt{n}}$, where `n` is the sample size. Multiplying the sample size by a factor `k` multiplies the standard error by $\frac{1}{\sqrt{k}}$. For example, quadrupling the sample size cuts the standard error in half.

Accuracy in statistics refers to how small the standard error is. A smaller standard error means the estimate is more accurate.

You can use the equation below to find the standard error for a percentage in a box model that has only ones and zeros. The percentages of ones and zeros should be written as proportions (e.g. `60% = 0.6`), and the result is multiplied by 100% to express it as a percentage.

$$
\sqrt{\frac{(\text{fraction of 1s})(\text{fraction of 0s})}{\text{number of draws}}} \times 100\%
$$
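
As a rough check of the formula above, here is a minimal Python sketch. It assumes the draws are made at random with replacement (or from a population much larger than the sample). The function name `se_percentage` and the draw counts of 100 and 400 are hypothetical, chosen only for illustration; the 60% figure reuses the example proportion from these notes.

```python
import math


def se_percentage(fraction_ones: float, num_draws: int) -> float:
    """Standard error of a sample percentage for a 0-1 box, in percentage points.

    Assumes random draws with replacement from a box of ones and zeros.
    """
    if not 0.0 <= fraction_ones <= 1.0:
        raise ValueError("fraction_ones must be a proportion between 0 and 1")
    if num_draws <= 0:
        raise ValueError("num_draws must be positive")

    fraction_zeros = 1.0 - fraction_ones
    sd_box = math.sqrt(fraction_ones * fraction_zeros)  # SD of a 0-1 box
    se_sum = math.sqrt(num_draws) * sd_box              # SE of the sum of the draws
    return se_sum / num_draws * 100.0                   # (SE_sum / sample_size) * 100%


# Illustrative box with 60% ones (0.6), drawn 100 times and then 400 times.
se_100 = se_percentage(0.6, 100)
se_400 = se_percentage(0.6, 400)

print(f"SE with 100 draws: {se_100:.2f}%")
print(f"SE with 400 draws: {se_400:.2f}%")
```

Under these assumptions, going from 100 to 400 draws (four times the sample size) cuts the standard error in half, which matches the $\frac{1}{\sqrt{n}}$ scaling described above.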