Database verification
dehere
17th June 2004, 12:15.09 PM
This post relates to a recent thread in the handicapping forum which involved an angle Jerry (ggpagels) had highlighted. The angle was nCLA<>3 nTRN>=400 nTRCH 1 or 2. Anyway, part of that thread suggested that one or more of us may have a screwed up database caused either by failing to download all race files or, perhaps, by duplicate database entries. Briefly, it appears that Jerry, Victor and I each came up with a different ROI for the same angle, although I also noted that we did not test over the same period of time.
The purpose of this post is to see if there is anything that we all might do to compare our separate databases with some master database to assure synchronicity. I am sure there are all sorts of issues regarding sharing data without having a proper membership or such, but it would be nice to be able to synchronize our various databases if we could show that we had a fairly complete database over a given period of time, say even once every month or so.
For example, I note that my HTR files take up about 180 megs of space each month, which ain't all that big. Perhaps a portion of homebase2.com's site could be set aside with a password protected folder that we could access by ftp to check our separate databases for synchronicity with a software program such as Beyond Compare. If it would help, I would be happy to offer ftp space on my website for this purpose.
I'm probably overlooking potential problems with this, but it would sure raise the comfort level and help us all avoid the garbage in, garbage out aspect of an error-laden database.
Rick
17th June 2004, 12:26.51 PM
The only way I know of is for all of you to agree to check the same tracks for the same period of time.
Then the question comes up: "Does paceline mode have any effect on any of the criteria used, and are you all using the same paceline?"
Have fun. :D
ggpagels
17th June 2004, 01:55.47 PM
Dehere-
I think some of the problem of varying results is not database errors but the tracks in the databases. I don't download every track into my database. I usually screen about 8-10 tracks every day. Since I'm on the west coast I have a complete database of California races but do not have a single race from your home track, CNL, in my database. I just download tracks I play in tournaments or wager on. I have nothing against CNL. I just don't have enough knowledge to feel comfortable handicapping CNL races. When KM does one of his "all burger" studies I always run the same study on my database to compare the results as sort of a sanity check on the tracks I follow. That is how I came up with the trainer 400+ research.
Victor
17th June 2004, 05:28.45 PM
Paceline 5, all tracks, all results.
In my opinion, a completely 100% accurate database is NOT the answer to all of our prayers. Thank God! We each have a brain that contains far more power than that.
dehere
17th June 2004, 08:19.04 PM
Darn, and I thought I had found the holy grail. Alas, I'll just pray for something else then. :)
hurrikane
18th June 2004, 09:40.57 AM
victor,
I'm confused by your statement. An inaccurate db can only give you erroneous results and cost you a lot of money. You always strive for 100% accuracy although you seldom achieve it.
I think it is probably the tracks that are creating the difference. But check for duplicate races and races without results. Check often, check close. It will save you a lot of money in the long run.
This has been a public service announcement from the Voice of Experience.
Thank you. :D
MikeDee
18th June 2004, 11:05.01 AM
I agree with HK. We are all downloading the same data, using the same software, and putting it into similar tables.
So the only ways things can be different are:
different time period
different tracks
different races at the same track
different fields in the filter
Very important to make sure that there is no duplicate data in your tables.
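One way to see which of those differences applies is to compare the race keys each person holds. This is only a rough sketch in Python, not anything HTR provides: the (track, date, race) tuples and the function name `compare_race_keys` are made up for illustration, and each person would have to export their own key list first.

```python
# Hypothetical race keys: (track, date, race number).
# Real keys would be exported from each person's own table.
def compare_race_keys(mine, theirs):
    """Return the races only in `mine`, only in `theirs`, and in both."""
    mine, theirs = set(mine), set(theirs)
    return {
        "only_mine": sorted(mine - theirs),
        "only_theirs": sorted(theirs - mine),
        "shared": sorted(mine & theirs),
    }

# Made-up sample data for two users.
my_races = [("CD", "2004-05-30", 1), ("CD", "2004-05-30", 2), ("SA", "2004-05-30", 1)]
their_races = [("CD", "2004-05-30", 1), ("CNL", "2004-05-30", 1)]

report = compare_race_keys(my_races, their_races)
for label, races in report.items():
    print(label, races)
```

Running a filter only over the "shared" races would take the track, date and race differences out of the comparison, leaving the filter fields as the remaining suspect.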
Cliff
18th June 2004, 12:44.07 PM
Partial cards due to inclement weather, cancellations, etc. might be another reason.
Would be interested in hearing what others do with those.
ggpagels
18th June 2004, 01:47.25 PM
Cliff-
I export my racefiles, results, and charts once a week into my database. One nice thing about the HTR program is that the race files are marked with a double asterisk (**) on the HTR2 Main Screen if the racefile, results and chart all match. For instance, on 5/30 CD only ran 4 races because of severe weather. I didn't handicap that day so I was unaware of the cancelled races. Before exporting at the end of the week I noticed that the 5/30 CD files were not marked with **. I checked the individual CD races and saw that races 1-4 had results and charts but races from 5 on did not. I went to Equibase and found out about the cancelled races and went back into HTR2 and deleted the unraced races. After reopening HTR2 the CD files showed ** and were complete. This is a sure-fire (and easy) way to verify you are not exporting incomplete data.
Victor
18th June 2004, 05:09.23 PM
hurrikane,
I was speaking philosophically. I certainly do strive for accuracy and do not export without results. :)
When it comes to predicting the future, I don't think that past results contained in a 100% completely accurate database are any match for the human mind. I'm not calling for any return to inaccuracy, I'm saying we are all more than capable of exceeding the predictive power contained within the database, which I consider to be only a tool.
zimal2
18th June 2004, 08:10.27 PM
Once one has already created a very large database/table, what is the best way to double check for duplicate files? Is there one? Or does the check have to be done earlier?
Zim
Rick
18th June 2004, 08:22.42 PM
Access has a Wizard built in to create a query to check for duplicates.
But as long as your Primary Key is set correctly, you don't have and will not have any duplicates.
If you want to create the query, click on Queries, New, Find Duplicates Query Wizard.
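For anyone curious what a find-duplicates query boils down to, here is a rough sketch using Python's built-in sqlite3 instead of Access, so the syntax will differ from what the wizard generates. The table name ALL_HX4, the four key fields (tTRK, tDATE, nRACE, tPGM) and the NumberOfDups column are the ones reported in this thread; the sample rows and the `keyed` table are made up for illustration.

```python
import sqlite3

# Key fields that together identify one horse in one race.
KEY = "tTRK, tDATE, nRACE, tPGM"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ALL_HX4 (tTRK TEXT, tDATE TEXT, nRACE INTEGER, tPGM TEXT)")

# Made-up sample rows; the last one repeats the second.
rows = [
    ("CD", "2004-05-30", 1, "1"),
    ("CD", "2004-05-30", 1, "2"),
    ("CD", "2004-05-30", 1, "2"),
]
conn.executemany("INSERT INTO ALL_HX4 VALUES (?, ?, ?, ?)", rows)

# Group on the key fields and keep only groups that occur more than once.
dups = conn.execute(
    f"SELECT {KEY}, COUNT(*) AS NumberOfDups FROM ALL_HX4 "
    f"GROUP BY {KEY} HAVING COUNT(*) > 1"
).fetchall()
print(dups)

# With the same fields declared as the primary key, the repeat never gets in.
conn.execute(
    "CREATE TABLE keyed (tTRK TEXT, tDATE TEXT, nRACE INTEGER, tPGM TEXT, "
    f"PRIMARY KEY ({KEY}))"
)
rejected = 0
for row in rows:
    try:
        conn.execute("INSERT INTO keyed VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected += 1  # duplicate key refused by the database
print(rejected)
```

An empty result from the first query means no duplicates on those four fields, which is the same thing a correctly set primary key guarantees up front.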
Victor
19th June 2004, 12:58.06 AM
Ran the Wizard on the ALL_HX4 table.
tTRK Field tDATE Field nRACE Field tPGM Field NumberOfDups
hurrikane
19th June 2004, 09:28.43 AM
Cliff,
I check races without results and toss them. Not sure of any other way to take care of the partial cards run.
One thing to consider if you are considering playing spot plays is to keep a table of races without results or scratches. If you have spot plays based on rankings, ie Ev1, these change with scratches. You will see very different results when scratches are added as the rankings change. So, unless you are sitting and watching the races all day, you will not be able to make the adjustments on the fly.
HBee
19th June 2004, 10:49.50 AM
Originally posted by hurrikane
Cliff,
I check races without results and toss them. Not sure of any other way to take care of the partial cards run.
If you select 'only export races with results' that should prevent adding the races not run into your database.
dehere
19th June 2004, 10:58.51 AM
Hurrikane, help me out please - why would you prefer to keep horses (and their related ratings) who were scratched in your database? When I run tests I want to see the relationship between ratings and results. I can't do that if the rankings which are in my database are for horses who never ran. For example, if I'm trying to see how top ranked Fr1 horses perform in sprint races the results would be misleading if there are top-ranked Fr1 horses in my database who never even ran. Maybe I didn't understand your point.
Victor
19th June 2004, 11:12.08 AM
dehere,
It's a tough call, but I agree with you in that I don't see the point of saving records of horses that did not run. It is true that scratches change the rankings. If you bet a spot play before scratches, the end results you save to your database are different than what you did in fact bet. I can live with that because for the most part, I find them to be close enough.
hurrikane
20th June 2004, 05:18.41 AM
I'm going to disagree Victor. I find they are not close enough at all.
If you have an Fr1 play that has 19% winners and a 1.10 ROI, how do you know which of those bets you would have made? Bottom line is you don't. This is one reason many spot plays do not work going forward. You can only truly test what the results would have been on the horses you would have bet. If you are playing your bets after scratches you would be ok with just the results. If you are playing them before scratches you are not testing the same conditions you are betting.
I have seen many bombs come in as the result of a scratch. If you don't take that into account you are not looking at reality and you will continue to wonder why your spot plays do not move forward with a profit.
Victor
20th June 2004, 08:32.28 AM
Hurrikane, do you then add the results manually to your table? When results are added automatically, the program also scratches horses. If I want to keep a table without scratches, and test precisely how well it predicts reality, what are my options?
Rick
20th June 2004, 09:24.03 AM
Victor,
You need to go back and re-read this thread.
Cane never said he keeps the races with the scratches included.
In fact he said just the opposite.
Victor
20th June 2004, 09:36.49 AM
Originally posted by hurrikane
One thing to consider if you are considering playing spot plays is to keep a table of races without results or scratches.
And what should I do with this table then? If I'm keeping it, there must be a reason for doing so. I would think its value could be to test the validity of spot plays made before scratches.
I know this has been discussed many times in the past, nevertheless, here's a two-part question:
1) What is the best way to handle the scratched horse problem?
a) ignore it.
b) spend time scratching and re-exporting and in effect, defeating perhaps the major attraction of spot-plays in the first place, the ability to be free to do other things. There is no way to fully account for late scratches at all tracks all day long.
2) If I choose to ignore scratches, that is, make bets before any scratches, can I build a table that could be tested without manually entering results?
Rick
20th June 2004, 10:01.10 AM
There is nothing wrong with keeping more than one table.
The whole thing is what works for you.
There is no one right way or wrong way of using it as long as it works for you, or is something you want to check out.
In general, there are some guidelines you should follow if you are going to maintain a table of past races, but you might find a good reason not to follow the general guidelines.
Some (but not all) of those guidelines are:
Keep duplicate races out
Keep only races with results
Do not mix pacelines
And then, And then .. :confused:
Rick
20th June 2004, 10:11.35 AM
One way you could try (I am not going to try it to make sure it works, but):
You can have two tables with the same races. One table is all your races without results and without any scratches removed. Your second table is the same races with results (adding results removes all scratches).
You can then create a query using both tables and using the date, track, race and program # fields for creating a relationship between the two tables. You have to modify the relationship so that all the records from your table with the scratches will show in the query. The default is only those records that are in both tables. You can do that by clicking on the joining line between the two tables listed in your query (right or left click?).
You can see if that will work. You might have to play with it or seek suggestions from other users.
That way you might get the results for your races with the scratches included without having to add them manually.
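In database terms, the modified relationship described above is a left outer join. Here is an untested-in-Access sketch of the idea using Python's built-in sqlite3. The table names `pre_scratch` and `with_results` and the columns `pre_rank` and `finish_pos` are hypothetical; only the four joining fields (tTRK, tDATE, nRACE, tPGM) come from this thread, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
key_cols = "tTRK TEXT, tDATE TEXT, nRACE INTEGER, tPGM TEXT"

# Table 1: every entrant as exported before scratches (hypothetical pre_rank).
conn.execute(f"CREATE TABLE pre_scratch ({key_cols}, pre_rank INTEGER)")
# Table 2: same races after results were added, scratches already removed.
conn.execute(f"CREATE TABLE with_results ({key_cols}, finish_pos INTEGER)")

conn.executemany("INSERT INTO pre_scratch VALUES (?, ?, ?, ?, ?)", [
    ("CD", "2004-05-30", 1, "1", 1),  # top ranked before scratches, later scratched
    ("CD", "2004-05-30", 1, "2", 2),
    ("CD", "2004-05-30", 1, "3", 3),
])
conn.executemany("INSERT INTO with_results VALUES (?, ?, ?, ?, ?)", [
    ("CD", "2004-05-30", 1, "2", 1),
    ("CD", "2004-05-30", 1, "3", 2),
])

# LEFT JOIN keeps every pre-scratch record; a scratched horse has no match
# in with_results, so its finish_pos comes back as NULL (None in Python).
rows = conn.execute("""
    SELECT p.tPGM, p.pre_rank, r.finish_pos
    FROM pre_scratch AS p
    LEFT JOIN with_results AS r
      ON p.tTRK = r.tTRK AND p.tDATE = r.tDATE
     AND p.nRACE = r.nRACE AND p.tPGM = r.tPGM
    ORDER BY p.tPGM
""").fetchall()

for pgm, pre_rank, finish_pos in rows:
    print(pgm, pre_rank, finish_pos)
```

The rows with no finish position are the scratched horses, so the pre-scratch rankings can be lined up against what actually ran without typing results in by hand.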
Victor
20th June 2004, 10:29.42 AM
Originally posted by Ricks
Keep duplicate races out
Keep only races with results
Do not mix pacelines
But my table does not fully represent the reality of betting spot-plays before scratches. The table assumes perfect knowledge and action vis a vis the scratched horse. I would like to build a table which tests the past performance of spot-plays made before scratches. Could this be done without manually entering results?
Victor
20th June 2004, 10:32.16 AM
Thank you very much for your suggestions Ricks, I will try them.
Victor
dehere
20th June 2004, 11:23.45 AM
What I get from all of this discussion seems to be quite simple - at least for the guy who plays at home, in front of the computer, where he can input scratches before making actual wagers. Basically, with an accurate database, void of duplicates and including only races with results, this happy guy can run all sorts of tests using potential spot play possibilities and determine, based on actual race outcomes, which of these spot plays make sense. Then, provided scratches are accounted for, this same happy guy can search that day's card for any spot plays that may pop up. Then this happy guy becomes even happier because, by golly, the spot plays pay off and he wins beaucoup bucks. Then, to complete this rosy scenario, the happy guy's kids give him an advance copy of Clinton's memoirs and tickets for the opening of Fahrenheit 9/11 as a Father's Day present. Am I missing anything here?
Victor
20th June 2004, 12:04.22 PM
It's nice work, if you can get it. :D
Happy Father's Day!