Transcript of Stage 4 statistics and probability

Speaker 1: Welcome everyone, nice of you to join us today. We'll be looking at Stage 4 Statistics and Probability today. I'll go straight into the presentation and we'll get started.

Like I said, today we'll be looking at the syllabus content in Stage 4 Statistics and Probability. We'll be looking at the new content, such as identifying variables as categorical or numerical data. It's not new to mathematics, but it's new to Stage 4. We'll be looking at investigating data collection from primary and secondary sources, developing intercultural understanding, and ethical understanding capabilities. We'll be looking at single variable data representation and the effect of outliers. And we'll be looking at probability, Venn diagrams, and two-way tables. And again, a little bit more about critical and creative thinking, especially when problem solving.

The first thing I'd like to do is look at the 7-10 Continuum of Key Ideas and look at the new content within the actual statistics and probability strand. When you look at it from Stage 4 all the way to Stage 5.3, you'll find that this is the one strand with the most content that's actually quite new. Some of it's moved down from Stage 5 to Stage 4. And then some of it, like that bivariate data analysis, some aspects are quite new altogether.

I know the typing is really quite small, but as you can see, we'll be identifying variables categorical and numerical, discrete or continuous. We'll look at collection and interpretation of data from primary and secondary sources, including surveys. That's part of data collection and representation in Stage 4. In Stage 4 single variable data analysis, we'll be investigating the effect of outliers on the mean and the median and look at calculating and comparing summary statistics of different samples drawn from the same population.

As you can see in the Continuum of Learning, we then move into single variable data analysis in Stage 5, which is your construction and interpretation of stem and leaf plots, looking at skewed, symmetrical and bimodal data, and interpreting and critically evaluating reports from the media and elsewhere that link claims to data displays and statistics. That continues into comparing shapes of box and whisker plots and corresponding histograms and dot plots, and critically evaluating sources of data in media reports and elsewhere.

The second part of this is the bivariate data analysis which is quite different to Stage 5. We'll go into depth with that in the next session, so I'll just leave that one there. And there's the probability. Now, probability is basically what we've always done in terms of simple events, looking at constructing sample spaces, et cetera, complementary events. The new thing here for us would be the Venn diagrams. So introducing Venn diagrams and looking at two-way tables and Venn diagrams together. The rest of the probability is the same in Stage 5 as well, except for the inclusion of Venn diagrams once again.

Okay. Let's have a look at the first outcome. Stage 4 Statistics and Probability. You've got investigate techniques for collecting data, including census, sampling, and observations. So you'll be defining the term variable in this stage. We'll be looking at categorical variables and numerical variables, discrete or continuous. We'll be identifying examples of categorical variables, such as colour and gender, and of discrete numerical variables and continuous numerical variables, for example height, weight, etc. We're looking at data collected on a rating scale being categorical data. We'll also look at explaining the difference between a population and a sample, and the difference between collecting data by observation, census and sampling, identify examples of variables for which data can be collected by observation or by census or by sample, and discuss the practicalities of collecting data through a census compared to a sample.

This is where some of the cross-curricular and learning across the curriculum sort of content comes in. You've got including limitations due to population size. Example, in countries such as China and India, a census is conducted only once per decade. So that's where all that sort of pops in and you can investigate those areas a little bit further with the students.

Students develop ethical understanding. I've sort of stopped and looked at ethical understanding because it comes up a fair bit in this area and so does Asia and Australia's engagement with Asia, as well as intercultural understanding. So we'll just stop and look at them closely for a minute and then we'll move on to the statistics stuff again.

All right. Students develop ethical understanding as they learn about and learn to act in accordance with ethical principles, values, integrity and regard for others. Within our new K-10 syllabus, some of the areas where students develop and apply ethical understanding are collecting and displaying data, interpreting misleading graphs and displays, examining selective use of data by individuals and organisations, and detecting and eliminating bias in the reporting of information. It's one of those areas where if you see those scales, which are up here in your syllabus, you really need to stop and think closely about the activities you're giving students and discuss [inaudible 00:05:34] ethical understanding within the context that you're teaching.

Asia and Australia's engagement with Asia also pops up here and it's this little A symbol here that you can see. In the syllabus, it states that the Asia and Australia's engagement with Asia priority provides a regional context for learning in all areas of the curriculum. An understanding of Asia underpins the capacity of Australian students to be active and informed citizens working together to build harmonious local, regional, and global communities and build Australia's social, intellectual, and creative capital. This priority is concerned with Asia literacy for all Australian students. Asia literacy develops knowledge, skills and understanding about the histories, geographies, cultures, art, literature and language of the diverse countries of our region. It fosters social inclusion in the Australian community and enables students to communicate and engage with the peoples of Asia so that students can live, work, and learn effectively in the region. Where do we see it? Well in this case, we're going to see it in investigations involving data collection and representation, which can be used to examine issues in the Asian region.

Intercultural understanding is another one that pops up here a fair bit and you get this sort of little global icon that comes up. It involves students valuing their cultures and beliefs and those of others and engaging with people of diverse cultures in ways that recognise commonalities and differences, create connexions, and cultivate respect. It can be enhanced if students are exposed to a range of cultural traditions en masse and can be demonstrated in many areas. For example, through examining Aboriginal and Torres Strait Islander peoples' perceptions of time and weather patterns, and the networks embedded in family relationships, as well as in activities such as examining patterns in art and design, learning about culturally specific calendar days, comparing currencies, and, in our case, showing awareness of cultural sensitivities when collecting data.

When we're talking about statistics and data, the first thing I think about is sport. Sport has a lot of statistics in it and it's a great example to use for students. Students should be aware of the relevance of data representation. So if you just think of the Premier League, in terms of students. They love soccer. In investigations it's important to develop knowledge and understanding of the ways in which relevant and sufficient data can be collected as well as implications and limitations of what constitute appropriate sources of data, both primary and secondary.

Data and statistics are used in many aspects of our everyday life. Data is collected to provide information on many topics of interest and to assist in making decisions regarding important issues. Example, projects aimed at improving or developing products and services. That data is important at every level. You've got users of data at the collection level, organisation, interpretation, and analysis. It is quite important that students actually understand that there are many layers of people who are using data, whether it's themselves, building their dream team or talking about their favourite team, or on a very professional level, where you've got the statistician sitting there in the coach's box, taking stats and tabulating it for the newspapers and the reports that come out on the next day.

Think about Rugby League, something a little bit closer to home. We have statisticians at every game for both teams. They sit in the coach's box, they collect data. The data is interpreted, displayed and analysed. The data is used for the half-time players talk. So in the players' sheds decisions are made, based on statistics collected about player performance and the collection of team performance. The data represents itself in many forms and many ways, NRL websites, newspapers, all over the different apps that kids look at in terms of rugby league scores and updates.

Each player's statistics determine how much a player is worth. Team statistics determine whether a team will make it to the grand final. It's a multi-million dollar industry that kids can relate to, based on player and team performance data and statistics. And it's a great one to go to. If you go to, you can also look up player stats and live stats and have heaps of meaningful conversations to really get kids thinking about statistics in the context of rugby league. You can get kids to look up their own favourite Rugby League team. Look at the stats that are released that week in the media or look at their own dream team that they've created.

But before I go further with Stage 4 Statistics and Probability, I'd like to step back into Stage 3 Statistics and Probability and I want you to have a look at the continuum of learning that connects to what we'll be doing in Stage 4. Now this is the new syllabus. This is what students will be doing in the future. Keeping in mind they don't start yet. They'll be starting roughly around 2015.

So if you can look here, we've got pose and refine questions to construct a survey to obtain categorical and numerical data. So the terminology of categorical and numerical data will be shown to students back in Stage 3, which is fantastic. Constructing and displaying dot plots and column graphs, with and without technology, creating tables to collect numerical data, constructing column and line graphs and dot plots, and recognising that line graphs are used to represent data that demonstrates continuous change. All this occurs at Stage 3 level.

Now, we move on. Recognise which type of data display is the most appropriate to represent categorical data, describe and interpret different data sets. This is all data part 1. There's also data part 2, which is down on the bottom here. So data part 2 is basically what Year 6 tend to see. So that will be interpreting and comparing a range of data displays, including side-by-side column graphs for 2 categorical variables. They'll create two-way tables to organise data involving 2 categorical variables, interpret side-by-side column graphs, and interpret and compare different displays of the same data to determine the most appropriate display for the data set. So it's quite sophisticated and it's good to see in Stage 3, so that when they come to Stage 4, they've got a bit of background knowledge happening there.

They'll also look at interpreting secondary data, presented in digital media and elsewhere, critically evaluate data representations found in digital media, discuss the messages that those who create a particular data representation might want to convey, identify sources of possible bias in representation of data in the media, and identify misleading or misrepresentation of data in the media as well.

Our only problem is in the first couple of years is that there will be gaps in knowledge existing in between the Stage 3 and Stage 4 continuum of learning because we ...

Considerations for prior knowledge is that in 2014 primary teachers start programming. We will be implementing in 2014 the Stage 4 content. So the students coming in in year 7 only have knowledge of the previous syllabus. In 2015, it's a little bit different. Primary teachers will begin implementing the new syllabus for all stages. What that means is that students in year 6 will have some knowledge of the new syllabus Stage 3 outcomes. It won't be until 2016 when students in year 6 will have access to all Stage 3 outcomes of the new syllabus and the content that you just saw.

So, what do your kids in Stage 3 actually come to you with in terms of next year's Year 7? In the current syllabus, what they do in data is actually display and interpret data, find the mean of a set of data, draw a pictograph, interpret graphs using the scale, draw a line graph, draw a divided bar graph and interpret it. So basically what they're doing is displaying and interpreting data at this point.

We have to keep in mind that if we're starting to teach our Stage 4 content, we might have to step back a little bit and make sure we unpack the terminology with them and be very clear that they're on the same wavelength with us.

In Stage 4, what will we be doing? Start with a variable. What is a variable? This comes from the glossary within your syllabus right at the back. Define a variable as something measurable or observable that is expected to change either over time or between individual observations. Examples of variables, age of students, hair colour, country of birth, shoe size.

Types of variables. Going into the fact that you have numerical and categorical variables. You have continuous and discrete variables. They're the numerical ones. Now, ordinal or nominal you can go into if you wish. Just keeping in mind that it is not compulsory at this stage. But it's helpful for students to have. So that's something you can keep in mind or you can just keep it at numerical and categorical, continuous and discrete and that's fine as well.

Classifying variables. You want to make sure kids understand the difference between the different categories and how to classify your variables. You've got your discrete being usually a whole number, a number count, shoe size, number of cars, etc. Continuous- usually a measurement. Nominal is name data. And Ordinal are adjectives that describe the numerical position of a subject.

Categorical Variables is a variable whose values are a category. Blood group is a categorical variable. Construction type, data collected on a rating scale, even postcodes. Because the numerals here have no numerical significance postcodes are classified as Categorical Variables. You've got colour, gender, etc.

Numerical Variables are variables whose values are numbers. A discrete numerical variable is a variable where there is a defined gap between possible values. So number of children, number of runs, number of cars, shoe sizes, 7, 7 and a half, 8 and a half. Those are all discrete numerical variables. Prices in dollars and cents are also classified as discrete. Continuous numerical variables are all measurements.

There's a lovely little clip here which you'll find on the building capacity resources. I've got the link in here, but I won't go to it today. It just has a little clip and it shows the kids the difference between numerical and categorical variables, discrete and continuous. So feel free to go in and have a look at that. It's nice for your teaching units that you're going to be developing.

Taking kids through this. State whether the data collected below are classified as categorical variables or continuous numerical variables or discrete numerical variables. And throw in all the different kinds of ideas, height of people, shoe sizes, ice cream flavours, postcodes, months of the year, ratings, students' favourite pets.
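The classification exercise above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name `group_by_type` are assumptions, and the classifications follow the definitions given in the talk (rating-scale data and postcodes treated as categorical, shoe sizes as discrete, measurements as continuous).

```python
# Example variables named in the talk, classified using the talk's definitions.
# The dictionary layout and function name are illustrative assumptions.
CLASSIFICATION = {
    "height of people": "continuous numerical",
    "shoe size": "discrete numerical",
    "ice cream flavour": "categorical",
    "postcode": "categorical",
    "month of the year": "categorical",
    "rating": "categorical",
    "favourite pet": "categorical",
}


def group_by_type(classification):
    """Return a dict mapping each variable type to a sorted list of variables."""
    groups = {}
    for variable, kind in classification.items():
        groups.setdefault(kind, []).append(variable)
    return {kind: sorted(names) for kind, names in groups.items()}


groups = group_by_type(CLASSIFICATION)
for kind, names in groups.items():
    print(f"{kind}: {', '.join(names)}")
```

Students could extend the dictionary with their own variables and check whether classmates agree with each classification.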

The next thing we look at is how we collect data. There are three ways of collecting data. We have sample, census, or observation. This all comes from your syllabus. A sample is a part of a population. It is a subset of the population, often randomly selected for the purpose of estimating the value of a characteristic of the population as a whole. A population is the complete set of individuals that we want information about. And a census is an attempt to collect information about the whole population.

For example, a randomly selected group of 8-year old children, the sample, might be selected to estimate the incidence of tooth decay in 8-year old children in Australia. That's your population. So collecting data by observation could be direction travelled by vehicles arriving at an intersection, type of native animals in a local area. Collecting data by a census or sample examples, a census to collect data about the income or the education of Australians, a sample for TV ratings, a sample for favourite sports. It's also important that we discuss the practicalities of collecting data through a census compared to a sample, including limitations due to population size, like we said before.

The next part of this outcome, collects, represents and interprets single data sets using statistical displays, involves exploring the practicalities and implications of obtaining data through sampling using a variety of investigative processes. You've got the random process of collecting data, identifying issues that make it difficult to obtain data from either primary or secondary sources, and discussing constraints that may limit the collection of data or result in unreliable data. Investigate and question the selection of data used to support a particular viewpoint.

This is where this building capacity resource comes in really handy. It's on the DEC website, where you go to the Australian Curriculum Resources and the live link is there, which you can download from the conclusion page at the end of this presentation. It's got some great little learning activities, videos that the kids can watch in class. There's a really nice one on primary and secondary data sources, so make sure you have a look at that one. And it goes through and there's some nice lesson plans there for you to follow.

The next part of this outcome looks at identifying and investigating issues involving numerical data collected from primary and secondary sources. We look at bias. We look at the effect of different sample sizes, how a random sample may be selected in order to collect data, how to detect and discuss bias, and how to construct appropriate survey questions and a related recording sheet in order to collect both numerical and categorical data about a matter of interest.

It brings us to this. Describe in practical terms how a random sample may be selected. In a simple random sample, every member of the population has an equal chance of being selected. For example, a lecturer delivers a lecture to 200 people. You need the names of all the students in no particular order, then you select names at random until you have the number that you need for the sample. Or you could number the list, use a random number generator and then select the names with those numbers on the list.
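As a rough sketch of the simple random sample described above, the snippet below draws names from a list of 200 so that every member has an equal chance of selection. The placeholder names, the sample size of 20 and the function name `simple_random_sample` are assumptions made for illustration only.

```python
import random

# Hypothetical roll of the 200 people at the lecture; names are placeholders.
students = [f"Student {i}" for i in range(1, 201)]


def simple_random_sample(population, sample_size, seed=None):
    """Select sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable in class
    return rng.sample(population, sample_size)


sample = simple_random_sample(students, 20, seed=1)
print(sample)
```

`random.sample` selects without replacement, so no name can appear twice in the sample.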

The next way, stratified random sample. This is used when the population contains different characteristics. So for example, you may have 400 students in the school, 100 of which are girls, 300 are boys. Using a random sample, you may include all the girls and not enough boys. So the sample size will not be proportionate to the number of girls and boys in the population. The information in this case that you collect does not accurately represent the population.

Divide your sample in the same ratio as girls to boys in the population, which is 1 to 3 in this case, and randomly select that number of girls and boys. If the sample is 100 students, you need a quarter girls and three-quarters boys, which is 25 girls and 75 boys. In which case, your sample now accurately represents your population. Going through this with the kids is really important.
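The stratified sample arithmetic above (400 students, 100 girls and 300 boys, sample of 100) can be checked with a short Python sketch. The function names and placeholder student names are assumptions; the integer division is exact for these numbers, but other populations may need a rounding rule.

```python
import random


def proportional_sizes(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size.

    Integer division is exact for the example from the talk; other
    numbers may need a rounding rule, which is not handled here.
    """
    total = sum(strata_sizes.values())
    return {name: sample_size * size // total
            for name, size in strata_sizes.items()}


def stratified_sample(strata, sample_size, seed=None):
    """Randomly select the proportional number of members from each stratum."""
    rng = random.Random(seed)
    sizes = proportional_sizes(
        {name: len(members) for name, members in strata.items()}, sample_size
    )
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}


# School from the talk: 100 girls and 300 boys (names are placeholders).
school = {
    "girls": [f"Girl {i}" for i in range(1, 101)],
    "boys": [f"Boy {i}" for i in range(1, 301)],
}
sizes = proportional_sizes(
    {name: len(members) for name, members in school.items()}, 100
)
sample = stratified_sample(school, 100, seed=1)
print(sizes)  # {'girls': 25, 'boys': 75}
```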

Your cluster sampling divides your population into groups and a simple random selection of these groups is made. Then survey everyone within the selected group.
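Cluster sampling as described above might look like the following sketch: the population is split into groups, some groups are chosen at random, and everyone in the chosen groups is surveyed. The class names, class sizes and function name `cluster_sample` are hypothetical.

```python
import random

# Hypothetical population split into four classes of 25 students each.
clusters = {
    class_name: [f"{class_name} student {i}" for i in range(1, 26)]
    for class_name in ["7A", "7B", "7C", "7D"]
}


def cluster_sample(groups, number_of_groups, seed=None):
    """Randomly choose whole groups, then survey every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(groups), number_of_groups)
    surveyed = [person for name in chosen for person in groups[name]]
    return chosen, surveyed


chosen, surveyed = cluster_sample(clusters, 2, seed=1)
print(chosen, len(surveyed))
```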

Construct a recording sheet that allows efficient collection. Decide whether a census or sample is more appropriate. Collect and interpret information from secondary sources. Interpret and use scales on graphs, including those where abbreviated measurements are used. If there's 50 on a vertical axis representing thousands, it's interpreted as 50,000. So give kids these types of examples to interpret as well. Analyse a variety of data displays used in the print or digital media. Identify features on graphical displays that may mislead. Misleading units of measurement, wrong labels on the axes, incorrect interpretation, not starting at 0, etc. Use spreadsheets or statistical software packages to tabulate and graph data, and discuss ethical issues that may arise from collecting and representing certain data. So this is all part of what we need to do.

In terms of constructing and comparing a range of data displays, including stem and leaf plots, this is what the syllabus asks you to do. So for the frequency distribution table, the kids have to use a tally to organise the data into a frequency distribution table. For frequency histograms, students are expected to construct and interpret frequency histograms, as well as select and use the appropriate scales and labels on the horizontal and vertical axes and recognise why a half-column-width space is necessary between the vertical axis and the first column of the histogram, which is down here. For the frequency polygon, they're expected to construct and interpret the frequency polygon, select and use the appropriate axes once again and come up with their diagrams, as usual.
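A frequency distribution table with a tally, as described above, can be mimicked in Python with `collections.Counter`. The quiz scores below are made-up data and the function name `frequency_table` is an assumption.

```python
from collections import Counter

# Hypothetical quiz scores out of 5 for twelve students.
scores = [3, 5, 4, 3, 2, 5, 3, 4, 3, 1, 4, 3]


def frequency_table(data):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(data)
    return [(value, counts[value]) for value in sorted(counts)]


table = frequency_table(scores)
print("Score | Tally    | Frequency")
for value, frequency in table:
    # One stroke per observation stands in for a hand-drawn tally.
    print(f"{value:>5} | {'|' * frequency:<8} | {frequency}")
```

The frequencies should add up to the number of scores, which is a quick check students can make on their own tables.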

In terms of the other graphs, you can see here that they're still expected to be able to construct all the above graphs, so dot plots, line graphs, sector graphs, divided bar graphs, ordered stem and leaf plots with 2 digit stems. They'll be able to interpret all these graphs. With the sector graph, they need to be able to calculate the angle at the centre required for each sector of the sector graph. And for the divided bar graph, calculate the length of the bar required for each section of the divided bar graph as well as calculate the percentage of the whole represented by categories in a divided bar graph.
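The sector graph and divided bar graph calculations above all come down to the fraction of the whole that each category represents. The sketch below uses made-up survey data (how 30 students travel to school) and a 10 cm bar; the data, bar length and function names are assumptions.

```python
# Hypothetical survey: how 30 students travel to school.
transport = {"bus": 12, "car": 9, "walk": 6, "bicycle": 3}


def sector_angles(frequencies):
    """Angle at the centre for each sector: fraction of the total times 360."""
    total = sum(frequencies.values())
    return {category: 360 * count / total
            for category, count in frequencies.items()}


def bar_lengths(frequencies, bar_length_cm):
    """Length of each section of a divided bar graph of the given total length."""
    total = sum(frequencies.values())
    return {category: bar_length_cm * count / total
            for category, count in frequencies.items()}


def percentages(frequencies):
    """Percentage of the whole represented by each category."""
    total = sum(frequencies.values())
    return {category: 100 * count / total
            for category, count in frequencies.items()}


angles = sector_angles(transport)
lengths = bar_lengths(transport, 10)
shares = percentages(transport)
print(angles)   # {'bus': 144.0, 'car': 108.0, 'walk': 72.0, 'bicycle': 36.0}
print(lengths)  # {'bus': 4.0, 'car': 3.0, 'walk': 2.0, 'bicycle': 1.0}
print(shares)   # {'bus': 40.0, 'car': 30.0, 'walk': 20.0, 'bicycle': 10.0}
```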

As a collective look, they need to be able to compare the strengths and weaknesses of different forms of data display and know which one is more appropriate for which type of data. They need to identify and explain which graphs are suitable for numerical data or categorical data and draw conclusions from data displayed in a graph.

They need to be able to investigate the effect of individual data values, including outliers, on the mean and the median. Probably one of the best ways to do this is to give them an example like this. Make sure they understand what an outlier is. Get them to calculate the mean, the mode and the median with all the data points. Once they've created their dot plot and they've calculated all these, remove the outlier and then repeat the calculations. Then ask the kids what effect the outlier had on the mean, the mode, and the median. And why is it more appropriate to use the median than the mean when the data contains one or more outliers? This is really important and worth showing. You don't need more than a slide or two to make this point. But it is quite powerful.
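The outlier activity above can be tried with Python's `statistics` module. The data set here is made up for illustration (it is not the slide's example), with 41 as the outlier; `multimode` requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Hypothetical data set; 41 is the outlier.
with_outlier = [4, 5, 5, 6, 7, 7, 7, 8, 41]
without_outlier = [value for value in with_outlier if value != 41]


def summary(data):
    """Return the mean, median and list of modes for a data set."""
    return {"mean": mean(data), "median": median(data), "mode": multimode(data)}


before = summary(with_outlier)
after = summary(without_outlier)
print(before)  # {'mean': 10, 'median': 7, 'mode': [7]}
print(after)   # {'mean': 6.125, 'median': 6.5, 'mode': [7]}
```

In this example, removing the outlier moves the mean from 10 to 6.125 but the median only from 7 to 6.5, and the mode does not change, which is the reason the median is preferred when outliers are present.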

Again, analysing house prices in a particular suburb, which data would be most useful? Mean, mode, or median? What would you use and why? And what do the kids see in the newspapers? Get them to open up the newspaper and have a look at the weekly sales, the weekend sales of houses and see how that is reported on in the media.

A salesperson ordering shoes for the store. You're analysing the data of shoes being purchased, which statistical data analysis would be more useful? The mean, the mode, or the median? And why?

Critical and creative thinking again appears a lot in this strand. Here's the formal definition for it. It's on page 41 of your syllabus. Students use critical and creative thinking in activities such as comparing actual to expected results, setting up statistical investigation, approximating, interpreting, estimating, examining misleading data, etc. So there's quite a scope in here to use this in terms of this topic.

Activities that promote critical and creative thinking integrate logic, solutions, innovation, communication, reflection, reason. It's really important that you give the chance to the kids to actually engage in critical and creative thinking properly. By sharing, thinking, visualising, and innovation and by giving and receiving effective feedback, students learn to value the diversity of learning and communication styles. This is very important to critical and creative thinking as a communicative process. They develop precision, flexibility. They know how to weigh out the evidence and they come up with some amazing things if you just give them the opportunity.

What kind of activities are we thinking about here and how do students learn to become or develop into critical and creative thinkers? Students learn to pose insightful and purposeful questions, apply logic and strategies to uncover meaning and make reasoned judgments, think beyond the immediate situation to consider the big picture before focusing on the detail, suspend judgement about a situation to consider alternative pathways, reflect on thinking, actions, and processes, generate and develop ideas and possibilities, evaluate ideas and create solutions and draw conclusions, and transfer their knowledge to new situations. This is all part of critical and creative thinking.

This takes us to syllabus bites, and in syllabus bites there are some great little learning tools for Venn diagrams. Venn diagrams come up in your probability section. It is basically what you've seen before. The one thing to note is that the students are expected to interpret and draw Venn diagrams with 2 attributes, but only interpret Venn diagrams with 3 attributes. They don't have to draw them. So keep that in mind. They do construct two-way tables and look at the attributes in the two-way tables as well and connect the relationships between Venn diagrams and two-way tables.

You'll see here that you'll look at mutually exclusive attributes in Venn diagrams. The language here is really important for you to unpack for the students. I think it's absolutely vital. We look at non-mutually exclusive attributes and going through and, or, neither, not, and what that actually means. So there are 25 students who play both basketball and football, 46 students who play basketball or football, but not both, 19 students who play neither sport, and 71 students who play basketball or football or both.
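The and, or, neither language above maps directly onto set operations, as this Python sketch suggests. The counts 25, 46 and 19 come from the example; the split of the 46 into 26 basketball-only and 20 football-only students is an assumption, since only their total is given, and the student ID numbers are placeholders.

```python
# Counts from the example in the talk.
BOTH, EXACTLY_ONE, NEITHER = 25, 46, 19
# Assumed split of the 46 students who play exactly one sport.
BASKETBALL_ONLY = 26
FOOTBALL_ONLY = EXACTLY_ONE - BASKETBALL_ONLY

ids = iter(range(1, 1000))  # placeholder student ID numbers


def take(count):
    """Return a set of `count` fresh student IDs."""
    return {next(ids) for _ in range(count)}


both = take(BOTH)
basketball = both | take(BASKETBALL_ONLY)
football = both | take(FOOTBALL_ONLY)
everyone = basketball | football | take(NEITHER)

print(len(basketball & football))               # and: 25
print(len(basketball ^ football))               # or, but not both: 46
print(len(everyone - (basketball | football)))  # neither: 19
print(len(basketball | football))               # or (inclusive): 71
```

Whatever split is chosen for the 46, the four printed counts stay the same, because 25 + 46 = 71 and 71 + 19 = 90 students in total.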

Again, the relationship of and, or, inclusive or, neither, not. Looking at the two-way tables and making sure kids actually understand that I think is really, really important. There are 63 male right-handed students, 63 students are neither female nor left-handed and there are 114 students who are male, right-handed or both.

In the syllabus bites you'll find these diagrams, which show you all the different versions, intersection, union, soccer and volleyball being mutually exclusive, and of course, representing students who play soccer, volleyball and both sports.

In terms of the three attributes, they only need to be able to interpret them. Again, looking at soccer and volleyball, the students who play soccer, the students who play volleyball or both. Looking at the intersection of all three, only the students who play the three sports. Looking at the intersection of two sports. And looking at the intersection of two, but disregarding the students who play hockey in this case. These diagrams will be very helpful when you're teaching.

These are the four syllabus bites and these are the links that take you there. There's introducing Venn diagrams, more Venn diagrams, language of Venn diagrams, using Venn diagrams to solve problems, and, of course, Venn diagrams and two-way tables.

This is just little snippets so you can see what happens. Some of the activities are really quite fun. There's sorting X-Factor contestants and looking at actually who their mentor was. If they can't remember the X-Factor contestants of that year, they go into Wikipedia and they can look up the whole results from the X-Factor finalists competition and they sort them into the different mentors. It's quite nice this activity. I think the kids will enjoy it. It goes on a little bit.

There's a syllabus bite for Will's Words. Words starting with S, words ending with E, words containing M, and then you can see here, there's one word that starts with S, ends with E and contains the letter M. The kids have to type in the words, et cetera and it marks it for them.

Mike's Eight Positive Integers. Again, another example and these you'll find all on the syllabus bites. Heaps of student activities there, sorting integers by multiples. More Venn Diagrams takes you into further areas and further attributes. Not just two, but three. There's the language of Venn diagrams and looking at all the different language and how they're represented. There's solving problems using Venn diagrams and the final one, Venn diagrams and two-way tables, is really nice to look at as well.

Whether you use it as a teaching tool or if the kids go home and have a play, it's fantastic. There's Facebook accounts, and looking at two-way tables, right-handed, left-handedness, and looking at two-way tables from there and Venn diagrams. Then combining the two. As the kids enter the data, the software actually tells them whether or not they're correct, depending on the numbers that they get.

Then there's some extension work, which is set theory right at the end as well. If you've got an accelerated class or a gifted and talented class and you want to go a little bit further, you can go into intersection of two sets, subset of a set, union of two sets, and complement of a set as well.
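For an extension class, the four set ideas above can be explored with Python's built-in sets. The universal set of the integers 1 to 12 and the choice of multiples are assumptions for illustration; they are not taken from the syllabus bites activities.

```python
# Hypothetical universal set: the integers 1 to 12.
universe = set(range(1, 13))
evens = {n for n in universe if n % 2 == 0}
multiples_of_3 = {n for n in universe if n % 3 == 0}
multiples_of_6 = {n for n in universe if n % 6 == 0}

intersection = evens & multiples_of_3  # in both sets
union = evens | multiples_of_3         # in at least one set
is_subset = multiples_of_6 <= evens    # every multiple of 6 is even
complement = universe - evens          # in the universal set but not even

print(sorted(intersection))  # [6, 12]
print(sorted(union))         # [2, 3, 4, 6, 8, 9, 10, 12]
print(is_subset)             # True
print(sorted(complement))    # [1, 3, 5, 7, 9, 11]
```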

Be sure to go into the syllabus bites documents and be sure to go into the building capacity resource for Stage 4 statistics, which you will find on the DEC website. It's got some great stuff in there as well. You can download all of these units and the PowerPoint that I presented today on the conclusion page, which you will find right near the end and will also be on the recording.
