#1
There is much talk about streaky players, with one example being Rick Roos' story posted today. This put me in mind of a study from a few years ago showing that the sense that basketball shooters get "hot" was a misperception. Things that seem exceptional to us seem to happen more frequently than they really do. In addition, it is easy to forget that clustering or streaking of events is to be expected. We know that flipping five heads in a row is unusual: there is a one in 32 chance of it if we pick up a coin and flip it five times. However, if we flip a coin 100 times, we should expect a streak of five heads in a row; there is about an 80% chance there will be such a streak somewhere in the 100.
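For anyone who wants to check the coin-flip figures, here is a minimal sketch of one way to compute them. It assumes a fair coin and independent flips; the function name `prob_streak` and its parameters are made up for illustration, and this is only one of several ways to do the calculation.

```python
def prob_streak(n_flips: int, streak_len: int, p_heads: float = 0.5) -> float:
    """Probability of at least one run of `streak_len` heads in `n_flips` flips."""
    # state[j] = probability that no full streak has happened yet and the
    # current run of consecutive heads has length j (0 <= j < streak_len).
    state = [0.0] * streak_len
    state[0] = 1.0
    for _ in range(n_flips):
        new_state = [0.0] * streak_len
        # A tail resets the current run to zero from any state.
        new_state[0] = sum(state) * (1.0 - p_heads)
        # A head extends the current run by one.
        for j in range(streak_len - 1):
            new_state[j + 1] = state[j] * p_heads
        # A head from the longest tracked run completes a streak; that
        # probability mass is simply dropped from the "no streak" states.
        state = new_state
    return 1.0 - sum(state)


five_in_five = prob_streak(5, 5)          # the "one in 32" case
five_in_hundred = prob_streak(100, 5)     # the "about 80%" case
print(five_in_five, round(five_in_hundred, 2))
```

The approach just tracks how long the current run of heads is after each flip, so it stays exact rather than relying on simulation.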
Similarly, we have an excellent intuitive sense that a point-per-game player will not score exactly one point in each game, but we don't necessarily have a quantitative sense of how much deviation there should be from that even pattern. In fact, for a ppg player, there is about a 37% chance that he will go pointless in any particular game due strictly to random fluctuation and not to any unusual characteristics of that player's performance.

To get a feel for the streakiness of point scoring in hockey, I collected the set of all forwards who had at least five multi-point games in both the 2010-2011 and 2011-2012 seasons, and calculated the expected number of multi-point games for each using his number of games played, his average number of points per game, and the Poisson distribution. I then divided the population for each year into performances where more multi-point games than expected occurred and those with fewer. Each player thus either had more than expected the first year and more the next, fewer and then fewer, more then fewer, or fewer then more. The first two types are what we would expect if the player's actual tendencies made him either more streaky or more consistent, respectively. The latter two do not lend any support to that argument.

If having more or fewer multi-point games than expected happened randomly in any given year, we would expect about 25% of players to fall into each of the four categories above (i.e. more/more, more/fewer, and so on). Of the 147 players considered, the data look like this, where + means more multi-point games than expected in a season and - means fewer:

............+.......-.......Total
+..........40......41......81
-..........40......26......66
Total......80......67......147

The p-value for this result is greater than 0.99 (from McNemar's test), meaning that in a truly random situation where there was no "real" streakiness characteristic for each player, a result at least this deviant from the perfect 25% of each type would happen more than 99% of the time.
In other words, there is no support at all among these data that having more or fewer multi-point games in a season is due to anything but chance.
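As a rough check on the McNemar figure, the sketch below runs the exact (binomial) form of the test on the two discordant cells of the table, 41 and 40. The post does not say which variant of McNemar's test was used, so the exact form is an assumption here, and `mcnemar_exact` is an illustrative name rather than a library function.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    k = min(b, c)
    # Under the null hypothesis each discordant player is equally likely
    # to fall in either cell, so the smaller count is Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Discordant cells from the 147-player table: +/- = 41 and -/+ = 40.
p_value = mcnemar_exact(41, 40)
print(p_value)
```

With the discordant counts this close to each other, the test has essentially nothing to reject, which agrees with the "greater than 0.99" reported in the post.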
__________________
C: Tomáš Plecanec (10), Jonathan Huberdeau (11) LW: Patrik Eliáš (14), Pascal Dupuis (10) Иlлья Kovalchuk (1), Brandon Saad (10) RW: Jarome Iginla (3), Phil Kessel (2) Jakub Voráček (9) Util: D:Ryan Suter (7), Mike Green (4), Brent Burns (10), Mark Streit, Justin Schultz (12) G:Craig Anderson (6), Дeвaн Dubnyk (17) Cepгeй Bobrovsky (9), Niklas Bäckström (8)
#2
Quote:
Mercy - I love your statistical crunching. This one lost me for a bit... so it is hard to agree or disagree.

I love the part about 5 coin flips. I've referenced it myself when people suggest there are cold or hot starters in the NHL. In a population of 500 skaters, there will always be a few that have 4 or 5 hot starts or cool starts... and this is just by the numbers - but there's no guarantee that an upcoming season will also be hot or cold. So I like this part & agree.

I read the last (quoted) paragraph over a few times and it still isn't exactly clear. If I actually saw a visual of you grouping & analyzing - it would be clearer. I'd probably like to see this sorted out by age group too.

i) As I think there are older players that have had anomaly-type increasing multi-point-game seasons (from say, Season X to Season Y) that will decline (in Season Z) naturally, since Season Y was an anomaly.

ii) And then there are younger players that had increasing multi-point-game seasons (from say, Season X to Season Y... and then increase again in Season Z).

But it might be interesting to see if there is some balance of anomaly-seasons with maturation-seasons that cancels any "trend" that might be there to suggest something. That was probably really unclear. Nice dig though! (some rep too)
#3
One possible problem with the above analysis is that maybe just a few players are streaky. If so, it might be possible to lose the evidence of their streakiness because it is mixed in among the many players who are not so. To check for this, I threw out all players who did not differ in both years from the expected number of multi-point games by at least one. This left 53 players:

............+.......-.......Total
+..........21......13......34
-..........13.......6......19
Total......34......19......53

A little more unusual, assuming randomness, but this still gives a p-value of 0.84. That means that even among those with more substantial deviations from expected scoring in any one season, the proportion of the time that they are consistent in the direction of the streakiness is still not more than would be expected due to chance.
#4
Quote:
Take Marian Gaborik. In 2012, he had 22 multi-point games. Scoring at a rate of 0.93 ppg, we would expect him to have 39% scoreless games and 37% one-point games, leaving 24% multi-pointers. Over his 82 game season (!) that would be 19.5 games. Therefore he was streaky last year, using the definition of having more multi-point games (which I stole/adapted from Mr. Roos' article). But in 2011, he had 0.77 ppg, which over a 62 game season (that's the Band-Aid boy we know and love) would predict 11.2 multi-point games, while in fact he was more consistent with only 7 multi-point games. Therefore he was in the "fewer then more" category. That's not what we would expect if Gaborik were actually a streaky player who consistently bunched his points together. Sure, he might actually be streaky at heart, and this pair of years may have just happened by chance, but that's what the test is for - to see how likely it is that that's the case.
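The Gaborik arithmetic can be sketched in a few lines. This assumes, as the post does, that points in a single game follow a Poisson distribution with mean equal to the player's points per game; the function name `expected_multipoint_games` is just an illustrative label.

```python
from math import exp


def expected_multipoint_games(points_per_game: float, games_played: int) -> float:
    """Expected number of games with two or more points under a Poisson model."""
    lam = points_per_game
    p_zero = exp(-lam)        # probability of a scoreless game
    p_one = lam * exp(-lam)   # probability of exactly one point
    # Whatever is left over is the probability of a multi-point game.
    return games_played * (1.0 - p_zero - p_one)


# Figures taken from the post: 0.93 ppg over 82 games, 0.77 ppg over 62 games.
gaborik_2012 = expected_multipoint_games(0.93, 82)
gaborik_2011 = expected_multipoint_games(0.77, 62)
print(round(gaborik_2012, 1), round(gaborik_2011, 1))
```

The results should land close to the 19.5 and 11.2 expected multi-point games quoted above; small differences come down to where the rounding is done.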
#5
Quote:
You seem to already understand the above given your last point that the balancing of anomalies would in fact be what we would expect if there were no true trend. I think your age comments stem from the thought that if you score more points, you should have more multi-point games, no? So as young players mature, they should naturally increase their multi-point games. True, but the anomaly doesn't necessarily have to change. The effect of more multi-point games due simply to squeezing more points into a fixed number of games is already accounted for using the point per game values and the Poisson distribution.
#6
Quote:
I do believe there are patterns out there w.r.t. streaks. But I think a sample size that is actually too big might hide some telling numbers. That is my suggestion - to try the same test... but to separate players by age, perhaps even by draft position. That's all. Just trying to open some more avenues of focused-analysis. Test sub-groups.
#7
Sorry, I tend to babble that way. Turns out my mathematical gum-flapping may have missed your point anyway.
Quote:
Quote:
But two caveats: first, with subgroups the sample size will be smaller, and so we would be less confident in any effect we think we see. Second, and more importantly, great care is needed when doing post hoc analysis of subgroups looking for trends. If I choose my sub-groups carefully enough, I can find almost any trend I like in any population: e.g. if I eliminate all people in a sample I think are hot, trying to avoid bias, I might find that the average number of balls per person is 1.5. I know that fiddling with the parameters until an effect is found is not what you meant, but it is part of why I haven't already looked in sub-groups - I don't trust myself.
Last edited by Hey Robbie; 01-31-2013 at 10:33 AM.
#8
Well, I was thinking over the way I had analysed the data, and I have a few problems I'd like to address. First, there are the issues noted above about whether most players should be included in the analysis, or whether that might lose the signal of truly streaky and extra-consistent players amid the mass of players in the middle, and thus if analyzing sub-groups would be better.
Another problem occurred to me though: The way I set up the analysis and the test I used didn't technically check if there was evidence of individual players' performances differing between 2010-2011 and 2011-2012, but rather whether there was some overall difference between the two seasons. I think I've figured out a better way to approach it.

This time I looked at the last four seasons and found the set of all players who had at least four multi-point games in each of those seasons. There were 24 such players. For each I figured out the expected number of multi-point games in each season as before, and compared the actual number to the expected number. For each season a player was either above (+) or below (-) the expected number of multi-point games. Thus for each player I had a series of four values. For example, Claude Giroux was ++++, meaning he had more multi-point games than expected in all four years, while Anze Kopitar was +---, meaning he was above expected in 2008-2009, but below in the next three seasons.

Giroux's ++++ is what we would expect from a player who was very streaky, in the sense of clumping many points into a few games. However, that doesn't mean that Giroux actually is streaky. Even if the "+"s and "-"s really are just random, in looking at 24 players we might expect a few players to happen to get ++++ anyway. In fact, under random conditions, we would expect that to happen to any given player about 6.25% of the time. Under random conditions, we might think it more likely to be over expectations half the time and under the other half, and in fact this should be much more common: about 38% of the time. Here are all of the expected percentages:

Result..........Expected %
4 +.............6.25%
3 + and 1 -.....25%
2 + and 2 -.....37.5%
3 - and 1 +.....25%
4 -.............6.25%

Things didn't come out exactly as these percentages would predict. The real numbers were:

Result..........Number.....%
4 +.............5..........20.8%
3 + and 1 -.....7..........29.2%
2 + and 2 -.....8..........33.3%
3 - and 1 +.....2..........8.33%
4 -.............2..........8.33%

Last, we just need to see if those numbers are different enough from the expected percentages that we think something besides random chance is going on. A chi-squared test for this gives a p-value of 0.92, which means that things would be at least this different from the expected percentages 92% of the time. In other words, there is nothing in the numbers for these 24 players that suggests that anything but random fluctuation is going on, with some years getting more multi-point games and some years getting fewer just by luck. In fact the p-value is even more convincing than usual, as by picking only players who had a certain number of multi-point games I probably skewed the distribution toward the "++++" side, which should make the chi-squared test return a lower p-value.
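The expected percentages in this post are just binomial probabilities for four independent seasons, each with a 50/50 chance of landing above or below expectation. A minimal sketch, with the 50/50 assumption and the name `sign_pattern_probs` being illustrative choices rather than anything stated in the post:

```python
from math import comb


def sign_pattern_probs(n_seasons: int = 4, p_plus: float = 0.5) -> dict:
    """Probability of k '+' seasons out of n_seasons, keyed by k (high to low)."""
    return {
        k: comb(n_seasons, k) * p_plus ** k * (1.0 - p_plus) ** (n_seasons - k)
        for k in range(n_seasons, -1, -1)
    }


probs = sign_pattern_probs()
# Scale the probabilities to the 24 players in the sample.
expected_counts = {k: 24 * p for k, p in probs.items()}
for k, p in probs.items():
    print(f"{k} + and {4 - k} -: {p:.4f} (about {expected_counts[k]:.1f} of 24 players)")
```

Note that with only 24 players the expected counts in the two extreme categories are 1.5 each, below the usual rule of thumb of 5 for the chi-squared approximation, so the p-value is best read with some caution.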
Last edited by Hey Robbie; 02-01-2013 at 10:37 AM.
#9
This is interesting work, Robbie.

I don't really measure streakiness in this way, but rather define it as the number of zero point games in a row (so I kinda work in reverse of your analysis), but still allowing for the occasional single point. I believe those cold streaks exist, but they would be exhausting to measure. And when a cold streak ends, regular scoring begins (what we call a hot streak, but is in fact what he should be doing in the first place). So a player on a 'hot streak' isn't measured by his multi-point games...but rather in how all the zero-point games stop happening.

I don't know if you're up for tackling it from this angle. But players go on clusters of cold and hot runs, some more than others. Huselius was classic for this. Vrbata is another one to look at. Kessel...tons of examples out there. A player is just snakebitten. He grips his stick tighter and keeps missing the mark. Once a couple go in, the confidence returns. And then it becomes what we call a hot streak, but in actuality is more of a "what he should be doing in the first place".
__________________
DobberHockey Rules 14 Team Keeper, points only, best 12 skaters 2 dman 2 G count. Playoffs count. F - Crosby, C.Smith, Wheeler, Parenteau, Hudler, Clowe, Grabovski, Atkinson, Frolik, S. Kostitsyn, Marchand, Peverley, Tavares, Read, Brouwer, Bickell, Gionta, Setoguchi G - Anderson, Hiller, Markstrom, Brodeur D - Letang, Del Zotto, I. White, Kronwall, Brodin, Nikitin
#10
Thanks for the feedback, Dobber. I agree with you that there are real mechanisms that really cause streakiness, but I also think that many people overestimate either the amount of streakiness of any one player or the number of players who experience such streakiness, either regularly in their careers or currently experiencing cold streaks at any one time. That's one of the big problems, I think, of trying to analyze this phenomenon on a broad, statistical scale: if only a small subset of players really experience such streakiness at an appreciable level, the effect could be lost in the wash.
I also wouldn't normally define streakiness using multi-point games. I chose to take that tack because it was how Rick Roos had discussed it in his article from last week (also it was a relatively easy set of data to grab). I think your definition of streakiness using consecutive blanks is a good one, and certainly a lot easier to figure out than the idea I had been toying with of something like a rolling average of points per week and then looking at the standard deviation! The only thing I'll have to ponder is how to think about the length of cold streaks, because if we just count, say, three or more zero point games in a row as a cold streak, it might make a player who scores in three, blanks for three, scores in one, blanks for three, and then scores in three look twice as streaky (because he has two cold streaks) as someone who goes six games without scoring and then scores in seven in a row. The latter seems streakier to me, not half as streaky. If I spend a little less time on my research for the "where are the good beer drinkers?" thread I will probably revisit this soon.
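To make the counting problem concrete, here is a small sketch that reports both the number of cold streaks and the longest one for the two hypothetical players described above. The three-game threshold comes from the post; the function name `cold_streaks` and the use of one point per scoring game are assumptions made purely for illustration.

```python
from itertools import groupby


def cold_streaks(points_by_game, min_len=3):
    """Return (number of cold streaks of at least min_len games, longest blank run)."""
    # Collapse the game log into runs and keep the lengths of the blank runs.
    blank_runs = [
        len(list(games))
        for is_blank, games in groupby(points_by_game, key=lambda pts: pts == 0)
        if is_blank
    ]
    qualifying = [run for run in blank_runs if run >= min_len]
    return len(qualifying), max(blank_runs, default=0)


# Scores in three, blanks for three, scores in one, blanks for three, scores in three.
player_a = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1]
# Six games without scoring, then scores in seven in a row.
player_b = [0] * 6 + [1] * 7

print(cold_streaks(player_a))
print(cold_streaks(player_b))
```

Counting streaks alone ranks the first player as twice as streaky, while the longest-run figure ranks the second player higher, which is the tension described in the post.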
Last edited by Hey Robbie; 02-04-2013 at 05:44 PM.