
 

Don't ignore other evidence when using statistics in fantasy evaluations.

 

Today I am going to detail my arguments against The Power of Numbers, an article written by Ryan Ma. After this, I promise to drop the matter.

Before I step into my arguments, some things from last week's article need clarification, as they weren't intended to offend.

"Pulling wool over eyes" … is directed at the great deal of emphasis that Ryan places on the correlation co-efficient value and what he calls the three most important factors. Do we need a value to tell us that a player has more opportunity to score if he a) gets more power play time, b) shoots more and c) gets more ice time? I stress the word opportunity because I want to draw a clear distinction: there is no guarantee of point production, whereas I get the feeling from Ryan’s article that, due to normalization, the points will eventually be generated. Not only do I think the correlation co-efficient value is not needed, I believe that it is skewed.

"Gambler’s Fallacy" … is not my idea. The concept has its own wiki page. We do not get to decide when, if at all, normalization will occur.

"Brain-dead logic" … I apply it to all of us, including myself, when we fall into the Gambler’s Fallacy. I repeat that I am not immune.

"Graphs" … I appreciate that Ryan attempted to place Points on the same axis for all the graphs but, due to Excel issues, was not able to do so.

Now onto my case. For consistency, I will be using the data found within Ryan’s pdf file for forwards.

First, I call upon Alexander Ovechkin.

 

Picture 1

 

Those are huge stats: 11,476 seconds of power play time and over 200 shots, the best of all the forwards.

Do we recall what Ryan says? The strongest correlation co-efficient value is the one for PPTOI. The second strongest was shots on goal.

We are looking at the best player in both those categories. So where would you expect him to rank amongst all other forwards in terms of point production? First? Second? Go ahead and take a look at the pdf file.

He didn’t even crack the top ten forwards! He is 19 points behind the leader, Sidney Crosby.

Ryan doesn’t try to understand why Ovechkin didn’t rank as the best player after all those games. He just holds onto normalization and says that he is a good buy candidate. Understanding what went wrong is not part of the equation.

With half a season gone, those huge correlation co-efficient values would not be making me feel very happy if I had drafted Ovechkin early in a draft this season.

Next, I call on his teammate Nicklas Backstrom.

Picture 3

Backstrom, with fewer than half as many shots and about 2,000 fewer seconds of power play time, but approximately the same amount of playing time, has more points than Ovechkin.

Oh Demetri, you are such a nitpicker. Four points difference. What is your point?

The two biggest factors as per Ryan’s article do not seem to be the biggest factors or else we would have seen Ovechkin with significantly more points. I specifically look at Backstrom because he plays on the same team as Ovechkin and he gets only 70.9% of the team’s power play time. Ovechkin gets 94.3%.

Backstrom does make the list of top five forwards though.

I now call upon Logan Couture.

Picture 4

Power play time and overall time on ice are comparable to Backstrom’s stats. His shot total is higher by 51 shots. How is it that he has 13 fewer points than Backstrom?

Couture was ranked 43rd among all forwards at the time. Why isn’t he a buy candidate or conversely why isn’t Backstrom a sell candidate? (Yes, I am aware that Couture is hurt at the moment. The question is rhetorical)

Powerful stuff these numbers are indeed.

Dustin Brown, please come forward.

Picture 5

I bring Brown up (ranked 209th among all forwards) to look at what Ryan does not: why his first half production did not match his previous career levels. To me, Ryan seems uninterested in finding out why Brown started out so badly this year but very interested in saying that his statistical numbers look good and he should get a good bump in the second half.

A quick scan of his news within The Hockey News site under his player profile reveals the following:

On September 13, 2013, Dustin Brown suffers minor hamstring injury, LA Kings Insider

On September 23, 2013, Dustin Brown did not participate in practice Sunday due to a hamstring injury, LAKings.com

Same day, Dustin Brown may not start season on time, Lisa Dillman on Twitter

Under Impact, “He hasn’t played any exhibition games and he’s [been] in one team practice since the start of camp. Coach Darryl Sutter isn’t sure if Brown will be available for the regular season opener. ‘If he doesn’t get any exhibition games, I’d say it’s highly unlikely that you’ve got a player that’s ready for the season,’ said Sutter.”

September 28, 2013, Dustin Brown set to play in preseason finale, LA Kings Insider

Under Impact, “It’s good. It felt pretty good for a week or so now, and getting into the on-ice stuff,” Brown said, referring to the ailment. “[Saturday], I think, will be good just to get hit and hit, just get the body ready for the next 82 games. The only thing I feel is probably the bruising is still there, but it’s just like any other bruise. It’s going to hurt until it’s gone. But the actual muscle feels good.”

October 1, 2013, Dustin Brown is rusty going into 2013-14, Los Angeles Times

Under Impact, “My conditioning, the physical part is OK,” said Brown. “It was the little puck plays, a couple plays where I mishandled it or didn’t make a touch pass I can make nine times out of 10. I’ve missed training camp and it shows.”

October 7, 2013, Dustin Brown dusting off the rust early on, Los Angeles Times

“I felt a lot better in Winnipeg than I did in Minnesota,” Brown said. “I’m sure I’ll feel a lot better getting a couple of days’ rest. … I’m sure I’ll be right where I need to be, just getting in games. Practices are one thing, but it’s the timing of games. For me I was happy we had back-to-back.” Brown is also wearing a knee brace due to the injury suffered in last season’s playoffs.

December 1, 2013, Dustin Brown continues to struggle, LAKings.com

Under Impact, Brown has contributed with a plus-10 rating and 22 penalty minutes in 27 games but fantasy owners expect a lot more than four goals and nine points at this stage of the season. He is a good buy-low candidate at this time.

January 4, 2014, Dustin Brown unhappy with his play, LAKings.com

Under Impact, “If I can pick my game up and elevate my game – I’m sure there’s other guys who feel the same way – our team is going to get better,” Brown said. “For me, it’s just about getting my game back, and a lot of it’s just details for me. So I’ve got to focus on those from game to game and be ready to go.”

Looks like he is still playing hurt or at a bare minimum he is finding it hard to keep up with everyone else. When Drew Doughty missed all of training camp holding out for a new contract he went through a bad season too.

Not a statistic but it is information. Information that can be useful in forecasting his point totals for the rest of the season.

I also want to point out that on December 1, 2013, he was listed as a good buy-low candidate by LAKings.com. From that point to the time captured in Ryan’s data file, he got four points in 13 games, a pace of about 0.31 points per game over that span.

I’ve got one more individual player to bring up but right now I want to call on all the forwards that have been plotted and evaluated.

I did this by hand, and by my count there are 514 forwards. The average player got 12.7 points, the median was 10 points and the mode was zero, meaning more forwards had zero points than any other total.
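For anyone who would rather not count by hand, the same three summary values can be pulled out with a few lines of Python. This is only a minimal sketch: the list of point totals is made up for illustration and is not taken from Ryan's file, and the function name `summarize` is my own.

```python
from statistics import mean, median, mode


def summarize(points):
    """Return the mean, median and mode of a list of point totals."""
    return {
        "mean": mean(points),
        "median": median(points),
        "mode": mode(points),
    }


# Hypothetical point totals for ten forwards (NOT from Ryan's pdf file).
sample_points = [0, 0, 0, 2, 5, 10, 12, 20, 40, 59]

summary = summarize(sample_points)
print(summary)
```

When a few big scorers sit on top of a crowd of low scorers, the mean lands above the median and the mode sits at the bottom, which is the same shape I found in the real list.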

With more than half the players getting 10 or fewer points, I suggest that the correlation co-efficient value is skewed to be so huge because the majority of players a) do not get many points, b) do not get much time on the power play and c) do not get as many shots.

That was my point about the values being denser at the lower end of the scales and sparser at the higher end.
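To show the kind of effect I am describing, here is a rough sketch using simulated players. Everything in it is an assumption for illustration: the ice-time range, the formula linking ice time to points, the amount of noise and the cut-off of 150 players are all invented, and none of it is a re-analysis of Ryan's data. It simply compares the correlation across a whole population with the correlation among the top scorers only.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation co-efficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated forwards: (average minutes of ice time, points).
# The relationship and the noise level are hypothetical.
players = []
for _ in range(514):
    toi = rng.uniform(5, 21)
    points = max(0.0, 3.0 * (toi - 8) + rng.gauss(0, 6))
    players.append((toi, points))

# Correlation over every simulated forward.
r_all = pearson([p[0] for p in players], [p[1] for p in players])

# Correlation over only the 150 highest scorers.
top = sorted(players, key=lambda p: p[1], reverse=True)[:150]
r_top = pearson([p[0] for p in top], [p[1] for p in top])

print(f"all forwards: r = {r_all:.2f}")
print(f"top 150 only: r = {r_top:.2f}")
```

In this simulated league the value for the whole population comes out higher than the value for the top group, because the crowd of low-minute, low-point players stretches the range and props the number up.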

For my final player I call upon Scott Gomez. For this I am using his career stats including his current statistics.

Picture 6

*Dobber's projections were obviously done prior to the given season starting

I bring him up to compare apples to apples because I think some of you won’t like my earlier comparisons of Ovechkin, Backstrom and Couture. So I am comparing Gomez to Gomez. (I do not have power play time on ice stats)

He burst on the scene in the 1999-00 season and got a smashing 70 points. With similar TOI and shot totals to his rookie year, in the 2002-03 season, he earned 55 points. I ask: if Ryan says that two of the most important indicators are TOI and shots, why the difference of 15 points? In 2005-06, Gomez's TOI and shots increased and he had the best year of his career, tallying 84 points. Hey, maybe Ryan has something here.

So why does he not reach that same point level in later seasons, especially in 2008-09 when he earned the most ice time of all his seasons and also the most shots? He comes up short by 26 points. So why such a fluctuation in his totals? The statistics were supposed to normalize. Are the numbers lying? Gomez may be an exception to the analysis, but he's one of many.

There are more than three key factors to point production and even Ryan admits to this in one of his responses to my original column. Under the response titled “Beene Counters” he says, “There are a multitude of factors that play into scoring and production.”

That is how I rest my case but before I end this article I will briefly address a secondary item, the topic of what I think could improve his system.

Ryan has taken much time and effort to come up with something. I appreciate that and from what I have read there are many others who enjoy his efforts in that regard. My efforts in these two articles were to point out the weaknesses (at least those perceived by me). I saw “Power of Numbers” and “most important factors” and felt it was incomplete. Ryan is absolutely correct in wondering why I did not offer up a better suggestion. So I will do so now. It involves augmenting his method with the following.

First, find a way to get rid of the lower-end players and not stress so much about the co-efficient value. Second, see if you can break your evaluations into top tiers like Austin Wallace did with his fantasy rankings. Compare the elite to the elite, and so on. Lastly, seek to find out why things didn’t work out in the first place.

Words and numbers, living together in harmony. Imagine that.


Comments (21)

Ryan Ma said:

Maaaasquito
Ovechkin hmmm 4 points last night, 7 in the last 3 and 14 in 11 since I released my 2nd half thread on the forums...

He's now 7th in league scoring, and really if you factor in the games he missed because of his injuries, he'd actually be #5.

It just shows you how fickle numbers really are... A couple of good games and boom numbers could begin to normalize just like that.
January 29, 2014
Votes: +0

Kofax said:

Kofax
... A number of points here. habs7079 smoked it, said exactly what I was thinking but couldn't put into words.

Demetrios is picking out individuals and showing how they don't fit the average data produced by Ryan, but that is to be expected in every large data set and really isn't something new. You never take a data set and apply it to a T, because there are always outliers. You use the data to help you make a better decision when comparing all outside factors. If you take player X, and the data suggests he is overperforming, you look at other issues: is there an injury causing him to gain more ice time? Is he playing on a more talented team that is on a hot streak? Who knows what you'll find. If there is not anything that sticks out, maybe you look for player Y who is underperforming and there do not seem to be any outside factors such as injury causing this, and you make a trade hoping for both to normalize. To me this process is common sense.

The above being said, it has been an interesting back and forth, and if nothing else has caused Ryan to come forth with even more analysis of his numbers, which has been helpful.
January 28, 2014
Votes: +0

austeane said:

austeane
Back and forth Steve, I think a back and forth is an amazing idea.. Especially if there is a continuing, respectful disagreement. You could take inspiration from this series of columns http://grantland.com/features/...simmons-v/ (but based on disagreement).
Even if you do come to an agreement eventually, the earlier emails would show your sides and it might actually be more enlightening if one of the correspondents is genuinely convinced by the other.

The debate could be numbers vs. words, the upside of a player, whether a player is a bust yet, etc.

I wonder if Angus might be willing to do one of these with Dobber every month. That would be great.
January 27, 2014
Votes: +0

metaldude26 said:

metaldude26
... horrorfan - a very good point. I know in the past I've done some of this "banter" with other writers on the site where I've disagreed, posted in the comments and then realized that the stuff was so good I needed to carry it over to an article of its own, be it when I was penning Cage Match or now with the ramblings. There's a way to do it well and there's a way to do it otherwise.

Personally, I would have hoped that given the size and nature of the comments that one of the two writers involved would have reached out to the other and had a more direct discourse via email. Perhaps something that could have been blown out for a great back and forth collaborative article or if nothing else something that could have helped create an article providing a resolution. So far, two Contrarian pieces in I still feel we are far from an understanding.

And let's be fair, as an editor this is probably something I should have pushed for. I know it's something I've tried to do in the past with Gates and other writers as well. The success has usually been in simply coming to an understanding rather than a blown out back-and-forth article but it's still been a success. In fact, this sort of back-and-forth and my desire to engage in it was the inspiration for my Cage Match series in the first place! It's entertaining as hell!

Thanks for reminding me of that, horrorfan. In terms of pushing the quality of the site both from an entertainment standpoint and an informative standpoint that sort of discourse could be incredibly valuable. I'll be passing this note along and hopefully that helps us improve how we go about this in the future.
January 27, 2014
Votes: +0

horrorfan said:

horrorfan
Better as comments than a column? The more I read these columns, the more I feel that this contrarian theme is more suited for comments at the end of a target article, rather than as a column itself. It would help make other columns more active, that's for sure.

If I were a newbie on this site, I'd wonder why there is a columnist whose focus is to 'criticise' another writer's column. Sure, it's good to have different opinions but I don't think a column is the best way to present it. Essentially, you have one writer trying to discredit another, which I feel takes away from the quality of the site.
January 27, 2014
Votes: +1

Atomic Wedgy said:

Atomic Wedgy
Huh? I didn't even get through half of this article before I got bored of the "my dick is much bigger than his" sentiment I was reading. An entire article (I'm guessing as I could not bear to finish reading it...) devoted to discredit a fellow DH columnist? It doesn't make sense to me. Maybe that was not the intention, but that is how it came across to this reader. Not a good read at all...
January 27, 2014
Votes: +1

Ryan Ma said:

Maaaasquito
MolsonX Definitely... numbers and stats can be cut and divided in an infinite number of ways to form an infinite number of arguments. I guess mine has the backing of being right more than half of the time (60-80%)... That's where my credibility lies.

This column was a bit more evidence-based than his first one, but I still didn't get the vibe, of where the numbers are entirely wrong, and that his alternative is a better option.



January 26, 2014
Votes: +0

MolsonX said:

MolsonX
... @Ryan - both words and numbers can be twisted to form opinions. People do it all the time on these forums and in articles - usually each opinion has merit. In saying that, I really appreciated your Power in Numbers articles - I thought they were very well done and a must read.

I didn't think Demetrios' argument was great in his first Contrarian column, but thought the one above was a good argument. You both have given us something to think about, that's for sure, and it's been a solid debate.

Thanks to both of you. Cheers.

January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
Words and Numbers As a final comment to your column.

Don't get me wrong, I am all for words and numbers living together in harmony.

It's just that I place a much bigger emphasis on the numbers than words, as words can easily be twisted or generated to form any opinion. As I have mentioned in my previous comments, if you're relying on opinions as your primary mode of defense, your word is as good as mine or anyone else's. It's much harder to refute tangible evidence when presented in front of your face, than to refute something when someone says "believe me because I said so."

As an aside, you could point out that I'm wrong about Ovechkin, Backstrom, Brown and even Gomez, but I'd counter that argument by just quickly sifting through my PDF, and although it's still a small sample size, I'm looking pretty good on 12/19 with a couple of them on par production:

Ryan 4 last 10
Duchene 7 in 12
Hossa 9 in 11
Staal 10 in 8
Vanek 16 in 13
Smith 6 in 10
Saad 5 in 13
Grabovski 3 in 11
JVR 11 in 13
Lucic 4 in 9
Schwartz 6 in 12
Richards 11 in 13
January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
Suggestions 1) I don't see how getting rid of the lower end players does much really? All it does is "prop up" the co-efficient value, at the end of the day, there is still a pretty clear correlation between points produced to ATOI even if I take into account just the "fantasy relevant" players.

Meaning even if I get rid of the "lower end" players, it doesn't change the fact that Crosby still got 59 points and Marleau tallied 37 in the first half. So how does getting rid of the lower end players, make any difference to the data? My argument is that it doesn't.

2) As for tiering, I've already done that as I sorted the data out by points produced. That naturally has already tiered the players in their respective point production. Crosby is at the top because he's produced the most, and it filters down by points produced down to Spencer Abbott who has 0. Isn't that tiering in itself? Is it something different you're expecting? Did you want me to tier it as in who to expect for an "uptick" and who to expect for "regression" because I did that in my 2nd column by labelling the players, Buy Now, Lay By, Farmer's Market and Fire Sale .

3) Once again that's not my prerogative with these columns. I wanted to stay away from the "opinions" and reasoning behind why a player isn't producing what they should be producing. At least lean away from resorting to "opinions" as the primary mode/defense of arguing a point. For me it's providing the tangible evidence through numbers to justify a position and not in your case defending a position through speculation (Brown).

I'm completely a blue-brained thinker and like to look at the facts and figures to justify a position, while others are more "big picture", "go with the flow" type of thinkers. Unless someone brings numbers and figures to the table, I'm just not a very strong believer in "opinions" being a powerful source of reasoning.
January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
Gomez You're looking at it from a skewed perspective.

First off, you're analyzing this from Gomez's "career year", which is an outlier year in itself. If you look at the data, everything else fell into place. Out of the 14 years you found only 1 that didn't fit, and then you want to claim that this model is flawed because it was incorrect 1 out of 14 times?

As Davidgoldburn said, this essentially reinforces my point that numbers tend to normalize more than not.

As for his breakdown, Steve beat me to it and I'll rehash some of his points.

05-06 is a completely statistically irrelevant year and completely non-comparable to 08-09 or even now in 13-14. It was the year after the lockout, and there were PP galore with the new rule changes.

The average goals scored per team was 2.93, this year it's 2.56, so you're looking at about a goal less per game than back in 05-06. In terms of PP opportunities, 05-06 there were 5.85 PPO per team per game, 08-09 4.16 and this year 3.34... On average there was 1.03 PPG scored per game in 05-06, 0.79 in 08-09 and 0.61 in 13-14.

So I echo what Steve said, this isn't an apples to apples comparison.

Look at the scoring leaders from the 05-06 season.

Thornton 125
Jagr 123
Ovy 106
Heatley 103
Alfredsson 103
Crosby 102
Staal 100
Kovalchuk 98
Savard 97
Cheechoo 93
Hossa 92
Richards 91
Selanne 90
Spezza 90
Gionta 89
Jokinen 89
Sakic 87
...
Prospal 80
Gagne 79
Rolston 79
Nylander 79
Hemsky 77
Straka 76
Stillman 76
Afinogenov 73
Kozlov 71
Stoll 68

I wouldn't be surprised if you made the case for each and every one of these guys above that you'd have the exact same scenario as you point out with Gomez. How many of these guys had career years that year and have never reached those heights ever again despite perhaps garnering more ice-time in later years?

The power of numbers isn't meant for comparing numbers across seasons, especially not when you're going to use them in an "outlier" year like 05-06. It's meant for comparing seasonal data from the "same cohort" of the same season. Notice how I don't use player X and compare them to 08-09 or 12-13 data, I compared them to their same cohort of the same year's data. Someone playing 18 mins in 05-06 would have dramatically different point totals than someone garnering the same in 13-14, so I don't really see your point in comparing Gomez playing 18:46 in 05-06 to Gomez playing 21:03 in 08-09.

In fact, if I used my model it seems to fit exactly what I'm trying to preach...

05-06 ATOI for players who tallied around the 84-point Gomez mark, 19:54, 17:53, 19:55, 18:57, 16:48, 19:04 and 18:46... Gomez is smack dab right in the middle of exactly what he should have been producing.

08-09 ATOI for players who tallied around the 58-point Gomez mark, 18:06, 19:55, 17:11, 19:55, 19:27, 20:17, 18:57, 16:49, 21:03, 18:54, 15:26, 18:53, 18:27 and 13:39... Gomez' ice-time was a little bit elevated compared to the "average" numbers, but is it all that far off from Frolov (19:55), Hejduk (19:55), Smyth (20:17) and Kesler (19:27)? Once again he was producing numbers exactly where he should have been.

So I don't see where "My efforts in these two articles were to point out the weaknesses (at least those perceived by me). I saw “Power of Numbers” and “most important factors” and felt it was incomplete." you have acknowledged this? You've presented a few "outlier" cases to point out the flaws of my model. But once again, my model according to the data only fits about 80% of the data, so there is a 20% error rate with the data. Pointing fingers at the 20% and saying "see this is wrong, that is wrong and this doesn't fit, that isn't doing what you say it's supposed to..." isn't convincing me that my model is completely worthless and untrustworthy or that your "opinions based informational" approach is much more accurate and a "better alternative".
January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
Tampering Data I did have a chance to have a chat with Lil_Rob about perhaps selecting just "fantasy relevant" data, but nixed that idea because it doesn't present the "full view" of the big picture and removing data like that really doesn't strengthen my point. It also brings the question of who deems what to be "fantasy relevant" and what isn't.

I will admit that if I took the data of say just the top-150 forwards, the correlation of the 3 factors isn't nearly as strong as it is when looking at the whole data set.

With that said, it's probably due to a lot of variance with such a small data set.

I mean if you look at the 37-pointers at the moment and compare it to ATOI, 17:47, 14:37, 17:10, 17:26, 18:25... You can already see a major outlier there. But generally speaking somewhere around the 17:30 mark is an "average" mark for a 37-point producer.

Now if you look at 38-point producers 19:34, 16:25, 20:22, 18:28, 19:06... You can see all of them are much higher than the 37-point producers. Does that mean that the definitive difference of a point is 1 minute extra per game?

Now look at players who have posted 48 points, 20:38, 19:02, 20:02... Is there a trend? Well I would argue that the big difference between a 37 point producer and a 48-point producer is about 2:30 more per game.

The power of numbers model is much more powerful when you're comparing it to a "whole population data set". It is easy to pick apart the model if you're just hand selecting "outliers" and saying well this player doesn't fit the model as you proclaimed, nor does this one, or this one... If you randomly selected any player given nothing but TOI, % of team PP and SOG data, you could within probably 65-80% accuracy pinpoint a ballpark figure of where their points totals are.

My big assumption is that "all players" will normalize towards the "average" somewhere down the line, and that I'm identifying those who are outliers at the moment who should normalize. Realistically speaking only 75-80% (if you look at R^2) fit into the model with any given consistency, so there's always going to be 20-25% who don't fit the model.

What you have to do is look at the power of the numbers, and rather than "nitpick" to say well this doesn't fit, that doesn't fit, look at the 80% that do and believe that the numbers don't lie.
January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
Brown Another note on Brown, do I think given his circumstances that he will improve (albeit slightly). Yes I do! But will it be something dramatic where he reels off a point-per-game the rest of the way. Hells no! But do I see it improving to say 0.5-0.6 the rest of the way. I'd be willing to take that leap of faith.
January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
Brown It's fine and dandy that you've brought up all these articles dated 4 months ago... but once again it's all speculation that you think the injury is lingering and the cause of why he's under producing...

The reason I'm uninterested in finding out the why is that 1) I don't really know the truth behind it and it's not up to me to speculate and 2) I don't really care. The purpose of my column was to identify "outliers", not come up with speculations as to why.

Even with all of the "evidence" that you posted, nowhere does it confirm that his poor play this season is injury related. That's your own conclusion that you've drawn to fit your argument.

13/09/13 - ok, he had a "minor" injury
23/09/13 - ok, he missed practice, might miss opener
28/09/13- he's still hurting a bit, "but it's just a bruise"
01/10/13 - slow start due to training camp
07/10/13 - timing is off, but he is wearing a brace from an injury from last season
01/12/13 - nothing about his injury at all, just saying that he's underachieving and poolies expect more.
04/01/14 - says he's "lost his game", but nothing regarding injuries holding him back or anything...

"Information that can be useful in forecasting his point totals for the rest of the season."

I agree, but what you deem "information" is what I question. I would classify that more as speculation more than "information". You're speculating that Brown's lack of production is perhaps due to lingering injury issues or him missing camp. From the evidence that you've posted, I'm not getting the same feeling. So really do you expect me to just take your word for it that it's completely due to Brown's injury that's the reason behind his drop off in production? I mean what tangible evidence have you provided besides "your opinion."

Here's the numbers argument.

2012-13: 46 GP, 29 points (0.63), 142 SOG (3.1 SOG/G), 19:30 ATOI and 53.2% of Kings' PP opportunities.
2013-14: 52 GP, 16 points (0.31), 135 SOG (2.6 SOG/G), 16:29 ATOI and 38.6% of Kings' PP opportunities.

Maybe it's because his responsibilities have been dramatically cut compared to last season. He's dropped nearly 3 mins per game on average and he's been completely relegated to the 2nd PP unit as opposed to the top unit with Kops.

Maybe it's because of the decrease in the 3 key indicators of success that his production has regressed, or I can take your word for it and chalk it up to him being hampered by a minor injury that he suffered back in Sept.
January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
Ovechkin One thing to keep in mind for my player comparisons is that I'm comparing a certain player to the "average" NHLer. The "average" NHLer being a more assist-heavy point producer than a real goal scorer. Ovechkin isn't your "typical" NHLer, which is why the numbers aren't falling perfectly in line, but they are starting to "normalize" a little bit. He is an "outlier" in itself.

"Ryan doesn’t try to understand why Ovechkin didn’t rank as the best player after all those games, he just holds onto normalization and says that he is a good buy candidate. Understanding what went wrong is not part of the equation."

I'm not trying to "not" understand why, I'm saying that it's not my job to. The goal of the column/PDF was to identify "outliers" in the data, not to formulate the reasons behind it. The reason why I don't is because that's what makes it "opinionated", and I'm staying away from that because it brings it directly into the apples vs. oranges argument. I don't know as a fact why Ovechkin isn't producing "average" NHL. I could opine that it's due to a lack of assists as to why his numbers are so low and he has the same point total as Backstrom, but there may be more underlying factors. Maybe there's a clause in his contract that he gets $10 mil if he hit 70 goals, which is why he's shooting so much, I don't know... Maybe he's not wearing his lucky assist jocks... All of those are "opinionated" reasons to justify, does that make them right? Your word is as good as mine, when it comes to using words to justify something without tangible evidence.

If you look at the small sample set, 9 assists and 40 points in the first half (38 games), compared to 4 assists and 10 points (10 games) so far in the 2nd half, you can already tell that the numbers are normalizing...

The "power of numbers" exists when comparing it to the population as habs7079 mentioned. When you compare on an individual vs individual basis, of course there are going to be discrepancies.
January 26, 2014
Votes: +0

doulos said:

doulos
Hmm. Ma's ideas seem to acknowledge that we will never have perfect information as fantasy hockey GMs, and so it's best to track trends and averages to try and target buy/sell candidates. Obviously to use those trends as religion and to blindly ignore any further information is silly, but Ma never suggests that either, so after reading both of these articles I still am unclear on what is being suggested by Demetrios.

Pointing out specific players and asking why they don't perfectly match Ma's models completely misses the point. Entirely.

"Ovechkin's numbers suggests he should be first or second in the league in points, but he's not, therefore the numbers are a poor way of evaluating players."

That seems to be the point being made and I'm stunned by how little sense it makes when you step back and look at it.
January 26, 2014
Votes: +0

Ryan Ma said:

Maaaasquito
habs7079 I think Habs hit the nail completely on the head.

When looking at the "averages" I'm looking at the entire population of NHL players. So when I compare a certain player to the "average" player production, that's how the comparisons work. The power of the numbers comes from looking at it from a "population average" perspective, not at every single individual player (there are players above the line, players below the line, and then there's a "line of best fit" to generate an average number). What I'm doing is looking at Ovechkin, Backstrom, Brown and comparing their current numbers with the "average" numbers that their peers are producing and deciding whether they are "on target", "elevated" or "underachieving". Of course when you look at an individual and compare it to the average, there is always going to be some variance. I worked out the correlation co-efficient to be something around 0.8 and an R^2 of 0.80, which means only 80% of the data fits into this model. That still means that 1 time in 5, I'm off track. There's no 100% model out there and I don't claim this to be one. But if you look at the data from a "whole" population perspective, it fits more than it misses.
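The comparison against a line of best fit described above can be sketched roughly as follows. This is a minimal illustration, not the actual model behind the PDF: the sample numbers, the `tolerance` cutoff, and the function names are all invented here, and a single predictor (power-play seconds) stands in for the three indicators. The sketch also prints r and R^2. In a one-variable least-squares fit these are tied together by R^2 = r^2 (so an r of 0.8 would go with an R^2 of about 0.64), and R^2 is read as the share of the variance in points that the line accounts for.

```python
import math

# Hypothetical (power-play seconds, points) pairs, invented for illustration.
# They are NOT taken from the PDF discussed in the article.
SAMPLE = [
    (3000, 18), (4500, 24), (6000, 35), (7000, 33),
    (8000, 46), (9000, 44), (10000, 58), (11000, 52),
]

def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

def classify(x, y, slope, intercept, tolerance=5.0):
    """Label a player by the residual from the fitted line.

    `tolerance` (in points) is an arbitrary cutoff chosen for this sketch.
    """
    residual = y - (slope * x + intercept)
    if residual > tolerance:
        return "elevated"
    if residual < -tolerance:
        return "underachieving"
    return "on target"

slope, intercept, r = fit_line(SAMPLE)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}")
for x, y in SAMPLE:
    print(x, y, classify(x, y, slope, intercept))
```

Where the cutoff sits is a judgment call; a wider `tolerance` flags fewer players as outliers.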

There is a noticeable trend: if I were to break down players by point production (35, 50, 65, point-per-game), there is a positive correlation with each of my 3 big indicators of success.
January 26, 2014
Votes: +0

metaldude26 said:

metaldude26
Gomez If we are looking at the Gomez example one of the things that has to be emphasized is that Gomez reached his career high of 84 points in the year following the lockout when power plays and scoring reached highs that declined in each subsequent year. This is not an apples to apples comparison because the NHL in 2005-06 was much different from the NHL in 2008-09.

I also take issue with the fact that you were unable to find power play ice time for Gomez. This stuff is readily available on NHL.com and a quick look would show you that Gomez saw less power play time in 2008-09 than in previous years and more penalty kill time. Not all ice time is created equal, right? That's a valid criticism so maybe a quick exploration of that would have unearthed an explanation people could use but I suppose that would involve dealing with more numbers.
January 26, 2014
Votes: +1

Dobber said:

Dobber
... You know why neither side will 'win' this discussion? Because they're both right. But then, in the last line of this column Demetri acknowledges that. We need both words and numbers to make the right decision. But all too often, the "words" aren't as readily available for our dissection. So we do the best with what we have.
And so, the best we can do is often just the stats. And while information on a player gives you an edge, I think the stats and trends that Ryan brings to light give you more of an edge. Of course you go with both if you can - that's just common sense. But if I had to pick one or the other, I'd pick the stats.
January 26, 2014
Votes: +1

habs7079 said:

habs7079
... I think the discussion generated by Demetrios' and Ryan's articles has been fantastic. I feel one thing that has been touched on but not directly addressed is the fact that statistical analyses like the regression analysis Ryan performed give us information about POPULATION averages.

There is a major distinction between the way populations behave and the way individuals within that population behave (I deal with this every day in my line of work, biomedical research, and can provide volumes of articles that illustrate this point). As Demetrios has taken the time to point out, there are a lot of unquantifiable factors that go into determining player scoring in a given season that are unique to individual players (injury, team, personal life, confidence, etc.). When we look at data on the population level, we in effect are only looking at trends/behaviors shared across all individuals in that population, and the larger the sample size we use, the more we reduce the impact unique individual behaviors have on our analysis. So, the larger our sample size, the better our data will model the "average" NHL player, but we also LOSE information about how individuals behave.

For these reasons, when we take small sample sizes, unique player attributes will have greater effect and the probability that the sample population will fit the model is low. When we take larger sample sizes, unique attributes have less weight and we should expect our sample to more closely model the entire population.

So with that understanding, Ryan’s analysis is not very accurate when taking a single player and then predicting their future production. A sample of n=1 is so small that there is a very low chance it will perfectly fit our model: this is why Demetrios is able to find so many examples that break the rule. However, if we roster a team of 10 forwards, the power of Ryan’s modeling becomes greater. If we roster a team of 25 forwards, Ryan’s modeling becomes even more effective. If we have a fantasy roster containing every player in the league, Ryan’s analysis will be very good at predicting how that team will perform in the future.
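The roster-size argument can be illustrated with a small simulation. This is a sketch under stated assumptions: each player's actual output is modelled as the population prediction plus independent random noise with the same spread for every player (a simplification), and the figures `PREDICTED_POINTS` and `NOISE_SD` are invented for illustration.

```python
import random
import statistics

PREDICTED_POINTS = 50.0  # hypothetical model prediction per player
NOISE_SD = 12.0          # hypothetical per-player spread around the model
TRIALS = 5000

def roster_error_spread(roster_size, rng):
    """Std. dev. of (roster average - model prediction) across simulated rosters."""
    errors = []
    for _ in range(TRIALS):
        actual = [rng.gauss(PREDICTED_POINTS, NOISE_SD) for _ in range(roster_size)]
        errors.append(statistics.fmean(actual) - PREDICTED_POINTS)
    return statistics.pstdev(errors)

rng = random.Random(2014)  # fixed seed so the run is repeatable
spreads = {n: roster_error_spread(n, rng) for n in (1, 10, 25)}
for n, spread in spreads.items():
    # Theory for independent noise: spread shrinks like NOISE_SD / sqrt(n).
    print(f"roster of {n:>2}: simulated {spread:.2f}, theory {NOISE_SD / n ** 0.5:.2f}")
```

Under these assumptions the average miss per player shrinks as the roster grows, which is the n=1 versus larger-roster point made above. The independence assumption deliberately ignores player-specific factors, so the sketch says nothing about why any one player misses.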

Ultimately, over the course of an entire fantasy season and looking at a team’s entire fantasy roster, heeding Ryan’s analysis will more often than not improve your odds of finding productive fantasy players. It will not, however, predict whether Dustin Brown will turn his season around, but merely tell us whether or not Brown is behaving like an “average” player. Beyond that, we need to look at the unique factors Demetrios has discussed in order to understand WHY Brown is or is not fitting our population model.
Hopefully that was more clarifying than confusing.
January 26, 2014
Votes: +2

davidgoodburn said:

davidgoodburn
... A couple points:

Your Ovechkin/Backstrom comparison brings up an interesting point. I would suspect that team shots while on ice (rather than individual shots) would have a better correlation to points. I would suspect that would serve to normalize Backstrom's points to Ovechkin's and give a better sense of regression. However, since you probably cannot track that, you just have to accept that Backstrom is an outlier to that factor, which I would guess he has been his entire career.

Your Gomez example only serves to reinforce Ma's point. His 84-point season is clearly an outlier based on shooting luck and you would expect a regression to the norm the next year (which did happen).
January 26, 2014
Votes: +2