Current PPX Thoughts…


DanG
27th May 2007, 11:47.51 AM
A couple months before Ken programmed the PPX function in HTR I was researching something similar. Naturally, he went beyond what my programming and imagination allow and it’s become an excellent tool when applied in context.

Pg 15 of this [July / Aug 2006] HTR newsletter speaks to the PPX basics.

http://www.homebased2.com/km/pdf/JUL-AUG%202006%20HTR%20NEWSLETTER.pdf

In January 2006 Ken wrote an article on “Quantifying Jockey Performance”. He concluded that, within reason, each 100 points of improvement / decline in the jockey rating can be treated as 1pt on a PER rating, for example.

Pg 7 of this [Jan 2006] HTR newsletter speaks to “Quantifying Jockey Performance”.

http://www.homebased2.com/km/pdf/HTRMonthlyReport-JAN2006.pdf

Combining these two concepts my goal was to establish plus / minus thresholds to the various PPX categories. Each category has its own range and characteristics so I’ll try and break them down individually with what I’ve found so far.

1st: In part because it made the research easier, and also because I’ve found it to be very powerful, all statistics are based upon the paceline that Paceline-5 (PL5) is choosing. (The information is in export HX7, and when two lines are chosen it displays the one furthest back in days.) I like basing the PPX theory on a representative paceline rather than only using the most recent. It’s also lost on the public, as they are obsessed with any change from the last race.

Now… I know you’re asking: if you don’t export to Access, do you have to flip between the PPX screen and a past performance screen to see which line PL5 has selected? Yes, for now. Far be it from me to make more work for Ken. How he keeps up with everything now is just amazing, but members have said he does custom work at terrific rates, and perhaps he could hook you up if it’s something you think is worthwhile.

Enough of that…How to rate each category and what categories to use?

E.g.: Previous rider = 200, today’s rider = 300. ((300 - 200) / 100) = +1pt, which can be applied to the PER rating, for example. (Obviously, if the situation is reversed the result is negative and it’s subtracted.)

Among the categories I’ve found effective I’m using…

• Jockey
• Trainer
• Pedigree
• K
• Pace (custom)
• HTR (custom, whole number)
• PP (Custom)
• FLD
• TPG (Custom)

Each category has its own unique range of min / max, so a scale was needed as the divider. Ken used 100 for the Jockeys, so I tried to approximate that # using standard deviation for all and it worked out OK. (You true math perverts will have a much more clever approach, as I never went beyond public high school AND I’m talking my beloved NJ. :D) The STDEV for the Jockey rating is 75, which is in the same ballpark as the 100 Ken used, so I carried it through to the other ratings.

(BTW: I’ve found the workout rating needs a minimum threshold for today’s rating for this to be effective. A 10 point move from 65 to 75, just doesn’t produce a linear table against the entire population. We all know what 75 – 85 + means ;))

Standard Deviation Variable: (If it’s not listed, it adds / subtracts using the whole number rating without dividing.)

Category        Divider
Jockey          75
Trainer         93
Pedigree        111
K               12
HTR (custom)    16
PP (Custom)     3
FLD             2

Example: Previous (PL5) Trainer = 200, today’s trainer = 400. ((400 - 200) / 93) = 2.15, etc…

I set up the previous paceline 5 data of the above categories and used the STDEV as the dividing variable and rounded to the nearest integer. The majority of ratings fall between -4 and +3 (8 individual data points.)
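For anyone who wants to try this arithmetic outside of Access, here is a minimal Python sketch of the divide-and-round step. The function names are hypothetical, and it assumes every category is measured as today's rating minus the PL5 rating. Python's built-in round() sends exact halves to the nearest even integer, which may not match the rounding used in the original research.

```python
# Dividers (standard deviations) quoted in the post.
# Categories that are not listed use the raw difference with no division.
DIVIDERS = {
    "Jockey": 75,
    "Trainer": 93,
    "Pedigree": 111,
    "K": 12,
    "HTR": 16,
    "PP": 3,
    "FLD": 2,
}


def ppx_adjustment(category, previous, today):
    """Return the rounded plus / minus adjustment for one category.

    Assumes the change is measured as today's rating minus the PL5 rating.
    """
    divider = DIVIDERS.get(category, 1)  # unlisted categories: no division
    return round((today - previous) / divider)


def bucket(adjustment):
    """Pool the tails the way the tables do: <= -4 and >= 3."""
    return max(-4, min(3, adjustment))


# Trainer example from the post: (400 - 200) / 93 = 2.15, which rounds to 2.
print(ppx_adjustment("Trainer", 200, 400))
```

The bucket() helper only mirrors the row labels of the tables in this post; it is not part of HTR.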

1st, one factor that those of you who are much smarter than me (meaning all of you) will want to apply is to “limit” today’s threshold with certain ratings. Obviously, if today’s K is 80 and the previous is 50, it’s not the same 30 point net gain as if the ratings were 110 and 80 respectively. I was not smart enough to build that into my initial research, but I do apply this reasoning to my day to day output.

How to rate the results? Again, you will have a better approach, but I used PER =1 and the animal in question must have started at least 5 times to eliminate the animals with huge fluctuations. The “base” IV value for PER=1 is 1.93. All subsequent data is measured against that and all horses in the following tables are PER =1.

ADJUSTMENT CATEGORY
PLUS / MINUS   JKY    TRN    K      PED    HTR    FLD
<= -4          1.30   1.2    0.25   1.6    0.29   1.29
-3             1.37   1.32   0.44   1.61   1.06   1.30
-2             1.59   1.37   0.67   1.76   1.42   1.61
-1             1.86   1.84   1.58   1.95   1.65   1.82
0              2.06   2.05   2.17   1.98   1.88   2.00
1              2.18   2.37   2.04   2.01   2.16   2.14
2              2.40   2.71   1.92   1.97   2.41   2.24
>=3            2.85   2.75   1.77   2.32   2.8    2.34
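A minimal Python sketch of how the table above could be used as a lookup, comparing each cell against the 1.93 base IV for PER = 1. The dictionary layout and function names are hypothetical; the figures are copied from the table and nothing here is recalculated from race data.

```python
BASE_IV = 1.93  # IV for all PER = 1 horses with at least 5 starts (from the post)

# Rows run from "<= -4" up to ">= 3", so list index = adjustment + 4.
IV_TABLE = {
    "JKY": [1.30, 1.37, 1.59, 1.86, 2.06, 2.18, 2.40, 2.85],
    "TRN": [1.20, 1.32, 1.37, 1.84, 2.05, 2.37, 2.71, 2.75],
    "K":   [0.25, 0.44, 0.67, 1.58, 2.17, 2.04, 1.92, 1.77],
    "PED": [1.60, 1.61, 1.76, 1.95, 1.98, 2.01, 1.97, 2.32],
    "HTR": [0.29, 1.06, 1.42, 1.65, 1.88, 2.16, 2.41, 2.80],
    "FLD": [1.29, 1.30, 1.61, 1.82, 2.00, 2.14, 2.24, 2.34],
}


def iv_for(category, adjustment):
    """Look up the IV for a rounded adjustment, pooling the tails."""
    clamped = max(-4, min(3, adjustment))
    return IV_TABLE[category][clamped + 4]


def relative_to_base(category, adjustment):
    """Ratio of the table IV to the PER = 1 base IV (1.0 means no edge)."""
    return iv_for(category, adjustment) / BASE_IV


# A +2 trainer move on a PER = 1 horse: 2.71 against the 1.93 base.
print(round(relative_to_base("TRN", 2), 2))
```

Dividing by the base is only one way to read the table; it makes it easy to see which moves help or hurt a PER = 1 horse and by how much.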

Again… I can’t stress enough how important it is to establish a min threshold for today’s rating, which is NOT reflected in this data. A “2 point” move in Pedigree is MUCH more significant from 250 to 450 than from 50 to 250, obviously.

The next three categories are a little off the beaten path and require exporting to a DB etc, or a custom HTR adjustment.

• “PP” ~ Prev post position vs. today is only applied when the prev PP was at two turns. Not enough impact comparing sprints to sprints and the one turn miles etc.
• “PACE” ~ is my own rating that attempts to quantify the pace pressure in a race. It can be approximated with HTR’s Q5 rating and I think would be a valuable addition to the PPX screen. No division is necessary in my version, but if using the strict Q5 rating you might want to experiment with dividing by at least 2.
• “TPG” ~ is just HTR’s TPG rating without the +,- symbols and expressed numerically. I.e. “A” = 1, “F” = 5, “N” = 0, etc… (BTW: I did eliminate “N” from this data to avoid comparing today to zero in the past.)
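The TPG conversion in the last bullet can be sketched in Python as below. The post only spells out A = 1, F = 5 and N = 0, so the values for B, C and D are an assumption, as are the function names. The filter drops a pair when either grade is N, which is one reading of the "eliminate N" remark.

```python
# Only A = 1, F = 5 and N = 0 come from the post; B, C and D are assumed.
TPG_VALUES = {"A": 1, "B": 2, "C": 3, "D": 4, "F": 5, "N": 0}


def tpg_to_number(grade):
    """Strip any trailing + or - and convert the letter grade to a number."""
    letter = grade.strip().upper().rstrip("+-")
    return TPG_VALUES[letter]


def is_comparable(previous_grade, today_grade):
    """Skip pairs where either grade is N, so nothing is compared to zero."""
    return tpg_to_number(previous_grade) != 0 and tpg_to_number(today_grade) != 0


print(tpg_to_number("A+"), tpg_to_number("F-"), is_comparable("N", "B"))
```

The direction of the plus / minus (today minus previous, or the reverse, given that A is the lowest number) is not stated in the post, so the sketch stops at the conversion.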

ADJUSTMENT CATEGORY
PLUS / MINUS   PP     PACE   TPG
<= -4          0.62   1.59   XXX
-3             1.62   1.64   XXX
-2             1.72   1.76   1.39
-1             1.80   1.89   1.71
0              1.88   1.95   2.04
1              1.98   2.07   2.12
2              2.02   2.11   2.25
>=3            2.36   2.25   XXX

Once again…This test was not set up properly and I now use a conversion table that is loosely based on these principles, but better reflects the power of today’s relevant HTR data. I hope this sparks a few ideas among the brilliant minds out there and the good thing is I feel most of you are way ahead of me and I’m certain Ken is.

Happy Memorial Day weekend to all the HTR family and in particular those who have served and sacrificed so much.

njcurveball
27th May 2007, 01:27.22 PM
This is great stuff Dan!

A testament to your genius and the programming of Ken.

I would say this post alone is worth more than most of the books published by the DRF!

Thanks for sharing!

Jim

DanG
27th May 2007, 02:24.32 PM
Thanks Jim,

I’ll certainly second the programming of Ken.

This sucker was much harder to post than to write. I must have posted this 5 times and still couldn’t see it. :eek:

Enjoy, and I hope you get to take in (or in your case, play in) a ball game during this holiday weekend.

Rick
27th May 2007, 02:35.34 PM
Try the Ctrl+F5 (Hard Refresh) trick when you can't see a post you just made.

It might help and might not.

I don't know what causes this problem.

DanG
27th May 2007, 02:47.05 PM
Will do Rick, thanks.

I thought it was a feature Ken and yourself installed when a thread contained excessive hot air! :D

MVM
27th May 2007, 03:25.25 PM
1st, one factor that those of you who are much smarter than me (meaning all of you) will want to apply is to “limit” today’s threshold with certain ratings. Obviously, if today’s K is 80 and the previous is 50, it’s not the same 30 point net gain as if the ratings were 110 and 80 respectively. I was not smart enough to build that into my initial research, but I do apply this reasoning to my day to day output.


Dan, you don't give yourself nearly enough credit.

By using StDev you DID build this into your research.

In a normal distribution, when you move from -1 to 1, you are crossing @68% of all values. When you move from 1 to 3 (also a 2 StDev leap), you are only crossing @16% of all values.

Now most of the HTR ratings do not produce a NormDist, but I don't think this is critical in regards to what you are trying to do.
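The two percentages can be checked with a few lines of Python using the standard normal CDF. This is only a sketch of the textbook normal curve; as noted above, the HTR ratings themselves are not normally distributed.

```python
import math


def normal_cdf(z):
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def share_between(low, high):
    """Fraction of a normal population lying between two z-scores."""
    return normal_cdf(high) - normal_cdf(low)


print(round(share_between(-1, 1), 3))  # about 0.683, the "68%" above
print(round(share_between(1, 3), 3))   # about 0.157, the "16%" above
```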

DanG
27th May 2007, 03:44.45 PM
Thanks for the confidence Mike.

I was serious when I said I can only discuss math up to a certain point with you guys and gals. I take what I know then I try and make it work with what my racing experience tells me.

Assuming the premise of a rising “PPX” value is valid, are you saying that a “3pt” rise for example is identical regardless of the size of today’s rating?

Ped: 400 to 700, vs. 100 to 400?

I’ve found most of these HTR ratings have a “power threshold” that, when crossed, makes them exponentially more significant.

MVM
27th May 2007, 04:32.19 PM
PED is a tough example, as fewer races and higher purses make the 4 ratings apples2oranges comparisons when dealing with a single horse (DirtSprint to TurfRoute as an example).

There is a much larger spread in the Turf Peds than in the Dirt Peds so a better way to compare the 2 ratings for a single horse would be to say that he is a +1.2 in a dirt sprint and a +2 in a turf route (for the 400-700 scenario) or a -2.6 in a DS and a +.4 in a Turf Route (the 100-400 scenario).

The above numbers use a Dirt Sprint Mean of 305 and StDev of 79, with the corresponding Turf Route numbers being 325 and 187.5.

In the 400-700 case, the horse moves .8 StDevs up, but (in theory) moves from the 88th percentile to the 98th percentile (his PED rating is better than or equal to 88% of all Dirt Sprint entrants and 98 percent of all Turf route entrants).

In the 100-400 example, the horse moves 3 StDevs up, but in this case he goes from <1% to 65%. This (again in theory) does not automatically make him a better turf router than the 400-700 horse, but if they are close to equals in a dirt sprint, then the former horse, who is likely to move up further with the surface switch, may surpass the (based on Ped) more diverse 400-700 blueblood.
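The z-scores and percentiles in this post can be reproduced with a short Python sketch. The means and StDevs are the ones quoted above; the function names are hypothetical, and the percentiles assume a normal curve, which is why they are "in theory" only.

```python
import math

# (mean, StDev) of the PED rating as quoted in the post.
PED_STATS = {"DS": (305.0, 79.0), "TR": (325.0, 187.5)}


def z_score(rating, category):
    """Number of StDevs the rating sits above or below the category mean."""
    mean, stdev = PED_STATS[category]
    return (rating - mean) / stdev


def percentile(z):
    """Theoretical percentile of a z-score under a normal curve."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for dirt_sprint, turf_route in [(400, 700), (100, 400)]:
    z_ds = z_score(dirt_sprint, "DS")
    z_tr = z_score(turf_route, "TR")
    print(
        f"{dirt_sprint}-{turf_route}: {z_ds:+.1f} to {z_tr:+.1f} StDevs, "
        f"{percentile(z_ds):.1f}% to {percentile(z_tr):.1f}%"
    )
```

Run as written, the 400-700 horse moves about 0.8 StDevs and the 100-400 horse about 3, matching the figures in the post.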

DanG
27th May 2007, 04:41.31 PM
Not only do you know what you’re discussing, but you presented it so even I could understand it! :)

Thanks Mike, well said.

tbrown
27th May 2007, 05:35.06 PM
Somebody is doing their homework!:D

N-I-C-E work.
Thanks for sharing.
You just got me looking at the comma charts for some nuggets, then you blow me away with this.

Mark
28th May 2007, 07:31.49 AM
Dan, you are a wellspring of information. I have read all your posts with the eager anticipation of being educated. This one is particularly fascinating.

Thanks also to Mike for the standard deviation primer.

DanG
28th May 2007, 09:06.17 AM
Thanks for the kind words Mark, Tom…

Trust me…The imagination all goes to Ken…I blame too much free time on Mondays for my ramblings.:p

One point about the IVs of the K rating, which I think illustrates why today’s rating has such a great impact and can create a bell curve in values:

RATING   IV
<=-4     0.25
-3       0.44
-2       0.67
-1       1.58
0        2.17
1        2.04
2        1.92
>=3      1.77

As HTR members know so well, the higher the K rating, the higher the probability of winning across the entire population. If I set a minimum threshold on “today’s” rating of, let’s say, 100, you would see a linear progression. An animal that is being spotted aggressively (i.e. high % trainer) will constantly have K ratings of 100 or better. The rating only has a max of 15 points in growth at this point and it distorts the statistics. The (>=3) row in this table can only have a K of 85 and the majority will be much less.

BTW: As Mike pointed out…I grouped this information together so it wouldn’t resemble War & Peace, but I do separate by Dist / Surface and a FLD size adjustment must be made to certain ratings…

Example:

The average winning HTR consensus whole number, sorted by FLD size. Obviously the consensus rating, being a line score in nature, would produce a near perfect score in a walkover (depending on bonus points), while in the Derby one entry can theoretically only accumulate so many points.

FLD AVG HTR
4 65
5 58
6 52
7 47
8 43
9 40
10 38
11 35
12 34
13 31
14 29
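One possible way to apply a field-size adjustment with this table, sketched in Python: express a horse's HTR consensus as a ratio to the average winning score for that field size. This is an illustration only and not necessarily the adjustment used in the research above; the function name and the clamping of field sizes outside 4 to 14 are assumptions.

```python
# Average winning HTR consensus (whole number) by field size, from the table.
AVG_WINNING_HTR = {
    4: 65, 5: 58, 6: 52, 7: 47, 8: 43, 9: 40,
    10: 38, 11: 35, 12: 34, 13: 31, 14: 29,
}


def htr_vs_field_average(htr, field_size):
    """Ratio of a consensus score to the average winner's score.

    Field sizes outside the table are clamped to 4 or 14 (an assumption).
    """
    size = max(4, min(14, field_size))
    return htr / AVG_WINNING_HTR[size]


# The same raw score of 52 is average for a winner in a 6-horse field
# but well above average in a 12-horse field.
print(round(htr_vs_field_average(52, 6), 2), round(htr_vs_field_average(52, 12), 2))
```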

Donnie
28th May 2007, 09:27.51 AM
Nice research Dan! Thanks for sharing your findings! You are carrying thru with thoughts I had way back when PPX first emerged! Using the PEDs exclusively in that screen helped me hit day money in the last tourney I was in at the Gold Coast.

DanG
28th May 2007, 10:22.44 AM
Using the PEDs exclusively in that screen helped me hit day money in the last tourney I was in at the Gold Coast.
Outstanding Donnie!

Donnie…I’m glad you checked in, I have a mortal lock and I wanted to share…

You were 1-9 to fire up your grill this weekend. :D

BTW: One reminder concerning the PPX Pedigree rating. When either race comes off the grass the rating does not change. Disregard it and move to the appropriate surface if it’s available.

Donnie
28th May 2007, 01:13.47 PM
Head for the windows Dan. You've got a winning ticket!
Grill has been fired up twice this weekend already.....once again tonight!

DanG
28th May 2007, 01:36.46 PM
Grill has been fired up twice this weekend already.....once again tonight!
I thought I smelled some Iowa Prime…:)

Medium rare Donnie with a choice Iowa baked potato and a cold adult beverage or three.;)

We have much to be thankful for Donnie. Let’s all raise our glasses in honor of this holiday.

Enjoy

DanG
29th May 2007, 07:39.00 PM
One final note;

Following up on Mike’s point about the Pedigree rating having different ranges (standard deviations) depending upon the dist / surf.

All Burger = 111


DS = 097
DR = 126
TS = 154
TR = 155

Thanks to Mike [MVM] for pointing out this important factor in adjusting PPX information.
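A minimal Python sketch of how the distance / surface dividers change the size of a Pedigree move. Reading DS, DR, TS and TR as dirt sprint, dirt route, turf sprint and turf route follows Mike's earlier post, and the function name is hypothetical. Using the divider for today's race, rather than the previous one, is an assumption.

```python
# Pedigree StDevs from the post: overall, then by distance / surface.
PED_DIVIDERS = {"ALL": 111, "DS": 97, "DR": 126, "TS": 154, "TR": 155}


def ped_adjustment(previous, today, category="ALL"):
    """Pedigree change in StDev units for the chosen category.

    Assumes the divider for today's distance / surface is the one to use.
    """
    return (today - previous) / PED_DIVIDERS[category]


# The same 200-point move is worth more in a dirt sprint than in a turf route.
print(round(ped_adjustment(200, 400, "DS"), 2))
print(round(ped_adjustment(200, 400, "TR"), 2))
```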

km
31st May 2007, 01:38.17 AM
Stimulating thread Dan, food for thought - thanks