Statistics is the collection and analysis of data drawn from a sample group to study and interpret their applicability to the larger population.
What Is Statistics?
Statistics is a branch of applied mathematics that involves the collection, description, analysis, and interpretation of data drawn from a sample of a larger population. Statistical sampling is used in medicine, finance, marketing, and many other fields to increase understanding and inform decision-making.
The mathematical theories behind statistics rely heavily on differential and integral calculus, linear algebra, and probability theory.
Key Takeaways
- Statistics involves calculating mathematical probabilities based on data collected from a sample group.
- The two major areas of statistics are descriptive and inferential.
- The work of statisticians is used in virtually all scientific disciplines as well as in finance, medicine, the humanities, government, and manufacturing.
Understanding Statistics
Statistics is used in virtually all scientific disciplines, such as the physical and social sciences, as well as in business, medicine, the humanities, government, and manufacturing. It is a branch of applied mathematics that developed from the application of mathematical tools, including calculus and linear algebra, to probability theory.
The core idea of statistics is that we can learn about the properties of large sets of objects or events (a population) by studying the characteristics of a smaller number of similar objects or events (a sample). Gathering comprehensive data about an entire population is too costly, difficult, or impossible in many cases, so statistics starts with a sample that can be conveniently or affordably observed.
Statisticians measure and gather data about the individuals or elements of a sample and analyze this data to generate descriptive statistics. They can then use these observed characteristics of the sample data to make inferences or educated guesses about the unmeasured characteristics of the broader population. These unmeasured population characteristics are known as parameters.
Fast Fact
Statistics dates back centuries. An early record of correspondence between French mathematicians Pierre de Fermat and Blaise Pascal in 1654 is often cited as an early example of statistical probability analysis.
Descriptive and Inferential Statistics
The two major areas of statistics are descriptive statistics and inferential statistics. Descriptive statistics describes the properties of sample and population data. Inferential statistics uses those properties to test hypotheses and draw conclusions.
Descriptive statistics include mean or average, variance, skewness, and kurtosis. Inferential statistics include linear regression analysis, analysis of variance or ANOVA, logit and probit models, and null hypothesis testing.
Descriptive Statistics
Descriptive statistics focus mostly on the central tendency, variability, and distribution of sample data. Central tendency refers to an estimate of the characteristics of a typical element of a sample or population. It includes descriptive statistics such as mean, median, and mode.
Variability refers to a set of statistics that show how much difference there is among the elements of a sample or population along the characteristics measured. It includes metrics such as range, variance, and standard deviation.
The distribution refers to the overall “shape” of the data. This can be depicted on a chart such as a histogram or a dot plot and includes properties such as the probability distribution function, skewness, and kurtosis.
Descriptive statistics can also describe differences between observed characteristics of the elements of a data set. They can help us understand the collective properties of the elements of a data sample and form the basis for testing hypotheses and making predictions using inferential statistics.
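As a rough illustration of the variability measures mentioned above, the following Python sketch computes the mean, range, sample variance, and sample standard deviation of a small data set. The numbers and the `describe` helper are hypothetical and chosen only for demonstration.

```python
import statistics

# Hypothetical sample of eight measurements (illustrative values only).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

def describe(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        # Sample variance and standard deviation use an (n - 1) denominator.
        "variance": statistics.variance(values),
        "std_dev": statistics.stdev(values),
    }

summary = describe(sample)
print(summary)
```

The sketch uses the sample (n - 1) versions of variance and standard deviation because the data is treated as a sample rather than a whole population; `statistics.pvariance` and `statistics.pstdev` would be the population counterparts.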
Inferential Statistics
Inferential statistics is a tool statisticians use to draw conclusions about the characteristics of a population from the characteristics of a sample, and to determine how certain they can be of the reliability of those conclusions. Based on sample size and distribution, statisticians can calculate the probability that sample statistics provide an accurate picture of the corresponding parameters of the whole population from which the sample is drawn.
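One common way to express that certainty is a confidence interval around a sample mean. The sketch below is a minimal example that assumes a normal approximation is acceptable for the data; the function name `mean_confidence_interval` and the sample values are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumes a normal approximation; for small samples a t-distribution
    would usually be more appropriate.
    """
    n = len(values)
    mean = statistics.mean(values)
    # Standard error shrinks as the sample size grows.
    std_err = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err

# Hypothetical sample of eight observations.
observations = [10, 12, 9, 11, 13, 10, 12, 11]
low, high = mean_confidence_interval(observations)
print(f"95% interval for the mean: {low:.2f} to {high:.2f}")
```

A larger sample or a less variable one produces a narrower interval, which reflects the dependence on sample size and distribution described above.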
Inferential statistics are used to make generalizations about large groups such as estimating average demand for a product by surveying the buying habits of a sample of consumers or attempting to predict future events. This might mean projecting the future return of a security or an asset class based on returns in a sample period.
Regression analysis is a widely used technique of statistical inference. It's used to determine the strength and nature of the relationship between a dependent variable and one or more explanatory or independent variables. The output of a regression model is often analyzed for statistical significance, which means that a result generated by testing or experimentation isn't likely to have occurred randomly or by chance.
Statistical significance suggests that the results are attributable to a specific cause explained by the data.
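To make the regression idea concrete, here is a minimal sketch of simple linear regression with one independent variable, fitted by ordinary least squares. The `fit_line` helper and the data points are hypothetical, and the sketch does not perform the significance testing discussed above.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lies exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Real data rarely falls on a perfect line, so in practice the fitted slope is accompanied by a measure of uncertainty before any conclusion about significance is drawn.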
Important
Having statistical significance is important for academic disciplines or practitioners who rely heavily on analyzing data and research.
Mean, Median, and Mode
The terms “mean,” “median,” and “mode” fall under the umbrella of central tendency. They describe an element that’s typical in a given sample group. You can find the mean by adding the numbers in the group and dividing the result by the number of observations in the data set.
The middle number in the set is the median. Half of all included numbers are higher than the median and half are lower. The median home value in a neighborhood would be $350,000 if five homes were located there and valued at $500,000, $400,000, $350,000, $325,000, and $300,000. Two values are higher and two are lower.
The mode is the value that appears most frequently in the data set. A data set can have more than one mode, or no meaningful mode at all if every value appears only once.
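The home-value example above can be checked with Python's standard `statistics` module. The bedroom counts used for the mode are a hypothetical addition, since the five home values contain no repeated number.

```python
import statistics

# Home values from the example above.
home_values = [500_000, 400_000, 350_000, 325_000, 300_000]

mean_value = statistics.mean(home_values)      # sum divided by count
median_value = statistics.median(home_values)  # middle value when sorted

# Hypothetical bedroom counts for the same five homes, used to show the mode.
bedrooms = [3, 4, 3, 2, 3]
mode_bedrooms = statistics.mode(bedrooms)      # most frequent value

print(mean_value)     # 375000
print(median_value)   # 350000
print(mode_bedrooms)  # 3
```

The mean ($375,000) sits above the median ($350,000) here because the single $500,000 home pulls the average upward.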
Understanding Statistical Data
Statistics is driven by variables. A variable is a characteristic or attribute of an item that can be counted or measured. A car can have variables such as make, model, year, mileage, color, or condition. Statistics allows us to better understand trends and outcomes by combining the variables across a set of data such as the colors of all cars in a parking lot.
There are two types of variables.
Qualitative Variables
Qualitative variables are specific attributes that are often non-numeric. Examples of qualitative variables in statistics include gender, eye color, or city of birth. Qualitative data is most often used to determine what percentage of an outcome occurs for any given qualitative variable. Qualitative analysis often doesn't rely on numbers. Trying to determine what percentage of women own a business, for example, is an analysis of qualitative data.
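Working out what percentage of a sample falls into each category of a qualitative variable can be sketched in a few lines of Python. The eye-color labels and the `category_percentages` helper are hypothetical.

```python
from collections import Counter

def category_percentages(labels):
    """Return the percentage of observations in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {label: 100 * count / total for label, count in counts.items()}

# Hypothetical eye colors recorded for ten people.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "brown", "blue", "green"]

percentages = category_percentages(eye_colors)
print(percentages)  # {'brown': 50.0, 'blue': 30.0, 'green': 20.0}
```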
Quantitative Variables
The second type of variable in statistics is the quantitative variable. Quantitative variables are studied numerically, and a number only carries meaning when it's tied to a description of what is being measured. The mileage a car is driven is a quantitative variable but the number 60,000 holds no value unless it's understood that it's the total number of miles driven.
Quantitative variables can be further broken into two categories. Discrete variables can take only certain values, which implies that there are gaps between the potential values. The number of points scored in a football game is a discrete variable because there can be no decimals and a team can't score only one point.
Statistics also makes use of continuous quantitative variables. These values run along a scale. Discrete values are restricted to certain points but continuous variables are often measured in decimals. Any value within possible limits can be obtained when measuring the height of the football players and the heights can be measured down to 1/16 of an inch if not further.
Statistical Levels of Measurement
There are several resulting levels of measurement after analyzing variables and outcomes. Statistics can quantify outcomes in four ways.
Nominal-level Measurement
There’s no numerical or quantitative value in this measurement and qualities aren't ranked. Nominal-level measurements are instead simply labels or categories assigned to other variables. It’s easiest to think of nominal-level measurements as non-numerical facts about a variable.
Example: The name of the U.S. president elected in 2020 was Joseph Robinette Biden Jr.
Ordinal-level Measurement
Outcomes can be arranged in an order using this measurement but all data values have the same value or weight. Ordinal-level measurements may be numerical but they can’t be subtracted from each other in statistics because only the position of the data point matters. Ordinal levels are often incorporated into nonparametric statistics and compared against the total variable group.
Example: American Fred Kerley was the second-fastest man at the 2020 Tokyo Olympics based on 100-meter sprint times.
Interval-level Measurement
Outcomes can be arranged in order in this measurement but differences between data values may now have meaning. Two data points are often used to compare the passing of time or changing conditions within a data set. There's often no “starting point” for the range of data values. Calendar dates or temperatures may not have a meaningful intrinsic zero value.
Example: Inflation hit 8.6% in May 2022. The last time inflation was this high was in December 1981.
Ratio-level Measurement
Outcomes can be arranged in order with this measurement and differences between data values have meaning. In addition, there’s a starting point or “zero value” that gives further meaning to a statistical value. The ratio between data values has meaning, including each value's distance from zero.
Example: The lowest meteorological temperature recorded was -128.6 degrees Fahrenheit in Antarctica in 1983.
Statistics Sampling Techniques
It's often not possible to access data from every data point within a population to gather statistical information. Statistics relies instead on different sampling techniques to create a representative subset of the population that’s easier to analyze. There are several primary types of sampling in statistics.
Simple Random Sampling
Simple random sampling calls for every member within the population to have an equal chance of being selected for analysis. The entire population is used as the basis for sampling and any random generator based on chance can select the sample items. Maybe 100 individuals are lined up and 10 are chosen at random.
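The 100-person example can be sketched with Python's standard `random` module. The helper name `simple_random_sample` and the fixed seed are assumptions made so the illustration is repeatable.

```python
import random

def simple_random_sample(population, size, seed=None):
    """Draw `size` distinct members, each with an equal chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, size)

# 100 individuals, identified here by their position in line (1 to 100).
people = list(range(1, 101))
chosen = simple_random_sample(people, 10, seed=42)
print(sorted(chosen))
```

`random.sample` draws without replacement, so no individual can be selected twice.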
Systematic Sampling
Systematic sampling calls for a random sample as well but its technique is slightly modified to make it easier to conduct.
A single random number is generated to determine the starting point and individuals are then selected at a specified regular interval until the sample size is complete. Suppose 100 individuals are lined up and numbered and the random starting point is the seventh individual. Every subsequent ninth individual is then selected until 10 sample items have been chosen. It would look like this: 7th, 16th, 25th, and so on through the 88th.
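The selection rule in this example (start at the 7th individual, then take every ninth) can be written as a short Python sketch. The function name `systematic_sample` and its parameters are hypothetical.

```python
def systematic_sample(population, start, interval, size):
    """Select `size` members, beginning at 1-based position `start`
    and then stepping forward by `interval` each time."""
    positions = [start + i * interval for i in range(size)]
    if positions and positions[-1] > len(population):
        raise ValueError("population is too small for this start and interval")
    return [population[p - 1] for p in positions]

# 100 individuals numbered 1 to 100; the start is fixed at 7 here,
# although in practice it would be chosen at random.
people = list(range(1, 101))
selected = systematic_sample(people, start=7, interval=9, size=10)
print(selected)  # [7, 16, 25, 34, 43, 52, 61, 70, 79, 88]
```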
Stratified Sampling
Stratified sampling calls for more control over your sample. The population is divided into subgroups based on similar characteristics. You would then calculate how many people from each subgroup would represent the entire population. Maybe 100 individuals are grouped by gender and race. A sample from each subgroup is then taken in proportion to how representative that subgroup is of the population.
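Proportional allocation across subgroups can be sketched as follows. The `stratified_sample` helper, the two groups labeled "A" and "B", and their sizes are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, sample_size, seed=None):
    """Sample from each subgroup in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    total = len(records)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from sample_size
        # when the proportions do not divide evenly.
        k = round(sample_size * len(members) / total)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 100: 60 people in group "A" and 40 in group "B".
population = [("A", i) for i in range(60)] + [("B", i) for i in range(40)]
sample = stratified_sample(population, key=lambda r: r[0], sample_size=10, seed=1)
print(sum(1 for g, _ in sample if g == "A"), sum(1 for g, _ in sample if g == "B"))  # 6 4
```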
Cluster Sampling
Cluster sampling calls for subgroups as well but each subgroup should be representative of the population. The entire subgroup is randomly selected instead of randomly selecting individuals within a subgroup.
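The difference from stratified sampling shows up clearly in code: whole subgroups are drawn at random and every member of a chosen subgroup is kept. The cluster names and the `cluster_sample` helper below are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep all of their members."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen_names}

# Hypothetical population split into four clusters of five people each.
clusters = {
    "north": [1, 2, 3, 4, 5],
    "south": [6, 7, 8, 9, 10],
    "east": [11, 12, 13, 14, 15],
    "west": [16, 17, 18, 19, 20],
}
picked = cluster_sample(clusters, n_clusters=2, seed=3)
print(picked)
```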
Fast Fact
Not sure which Major League Baseball player should have won Most Valuable Player last year? Statistics is often used to determine value and is frequently cited when the award for the best player is announced. Statistics can include batting average, number of home runs hit, and stolen bases.
Uses of Statistics
Statistics is prominent in finance, investing, business, and a wide scope of sectors. Much of the information you see and the data you’re given is derived from statistics used in all facets of a business.
- Statistics in investing include average trading volume, 52-week low, 52-week high, beta, and correlation between asset classes or securities.
- Statistics in economics include gross domestic product (GDP), unemployment, consumer pricing, inflation, and other economic growth metrics.
- Statistics in marketing include conversion rates, click-through rates, search quantities, and social media metrics.
- Statistics in accounting include liquidity, solvency, and profitability metrics across time.
- Statistics in information technology include bandwidth, network capabilities, and hardware logistics.
- Statistics in human resources include employee turnover, employee satisfaction, and average compensation relative to the market.
Why Is Statistics Important?
Statistics is used to conduct research, evaluate outcomes, develop critical thinking, and make informed decisions about a set of data. Statistics can be used to inquire about almost any field of study to investigate why things happen, when they occur, and whether reoccurrence is predictable.
What’s the Difference Between Descriptive and Inferential Statistics?
Descriptive statistics are used to describe or summarize the characteristics of a sample or data set such as a variable’s mean, standard deviation, or frequency. Inferential statistics employ any number of techniques to relate variables in a data set to each other. An example would be using correlation or regression analysis. These can then be used to estimate forecasts or infer causality.
Who Uses Statistics?
Statistics are used whenever data are collected and analyzed, and they are used widely across an array of applications and professions. These include government agencies, academic research, and investment analysis.
How Are Statistics Used in Economics and Finance?
Economists collect and look at all sorts of data ranging from consumer spending and housing starts to inflation and GDP growth. Analysts and investors collect data about companies, industries, sentiment, and market data on price and volume. The use of inferential statistics in these fields is known as econometrics.
Several important financial models including the capital asset pricing model (CAPM), modern portfolio theory (MPT), and the Black-Scholes options pricing model rely on statistical inference.
The Bottom Line
Statistics is the practice of analyzing data and drawing inferences from the sample results. Statistics is used across a variety of fields from governmental agencies to finance to draw conclusions about a given data set.
The study of statistics can lead to a career as a statistician but it can also be a handy tool in everyday life. Statistics can be used to gain insights on probable outcomes of objects or events whether you’re analyzing the odds that your favorite team will win the Super Bowl before you place a bet, gauging the viability of an investment, or determining whether you’re being comparatively overcharged for a product or service.