Too Late To Validate
Author | Message |
---|---|
Chris Granger Joined: 2 Sep 11 Posts: 3 Credit: 34,381 RAC: 0 |
Are the deadlines for work units really short for this project? My Linux machine was throwing up errors, so I attached with a slower Windows machine but finished my first work unit as invalid, with the error "Completed, too late to validate"... This isn't the fastest machine around (2GHz dual core) but neither is it the slowest. Can you look into this and see if something is wrong? |
AMDave Volunteer moderator Volunteer tester Joined: 30 Aug 11 Posts: 41 Credit: 100,018 RAC: 0 |
The yafu deadline is so short that the BOINC Manager is preventing any other project from running while there is yafu work in the queue. Please extend the yafu deadline to at least 3 days, as soon as possible. |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Joined: 22 Aug 11 Posts: 739 Credit: 17,612,101 RAC: 10 |
Hello, I will not extend the deadline. I need the results fast. Yes, once a workunit is downloaded, Boinc starts it immediately. But I also run YAFU myself, and other projects run very well between YAFU workunits. Boinc doesn't immediately download new YAFU workunits when all are finished. First, other projects get CPU time according to their resource share. Only when YAFU's resource share is endangered does Boinc download new workunits. I use nearly the latest Boinc version. |
Chris Granger Joined: 2 Sep 11 Posts: 3 Credit: 34,381 RAC: 0 |
So a 2GHz machine is too slow to be useful? OK, detaching my Windows box... |
bellialuss Joined: 3 Sep 11 Posts: 1 Credit: 7,867 RAC: 0 |
"So a 2GHz machine is too slow to be useful? OK, detaching my Windows box..." Not only your machine :D I have 2x 3.5 GHz and get the same message... The workunit should crunch for 25 min, but it is now at 40 min and the remaining time shows "--". Bug? |
zombie67 [MM] Joined: 29 Aug 11 Posts: 38 Credit: 13,384,348 RAC: 0 |
I think there are a couple of problems here. 1) The deadline is way too short. Needs a few days added at least. 2) The first tasks were very short, compared to what is being issued now. So BOINC learned how short the tasks were, and then over-downloaded too many tasks. No way to finish them in time. DCF will correct for this over time. Reno, NV Team: SETI.USA |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Joined: 22 Aug 11 Posts: 739 Credit: 17,612,101 RAC: 10 |
AFAIR the deadline is 10h and a workunit (not the fast ones) runs ~1h. |
ChertseyAl Joined: 29 Aug 11 Posts: 7 Credit: 11,046 RAC: 0 |
Good luck with that. I was going to post a good old rant about BOINC being about cooperation between the project and the volunteers, and how volunteers are pretty much the only thing that keeps the project alive, supplying processor power, paying for their own equipment, paying for the electricity, etc etc. But I won't. Will I continue to crunch this project? Probably. At least until I've hit the MM I'm looking for. As will many others I'm sure. Once the hardcore Milestoners are sated I'm not sure who is going to crunch a project that so aggressively takes over resources. Whatever, keep on crunchin' guys! ;) Al. |
zombie67 [MM] Joined: 29 Aug 11 Posts: 38 Credit: 13,384,348 RAC: 0 |
"AFAIR the deadline is 10h and a workunit (not the fast ones) runs ~1h." This prevents a volunteer from downloading even half a day's worth of work. Not everyone is connected to the internet 24/7. Due dates should be no less than 24 hours. Even 48 to 72 hours is much shorter than most projects. 1-2 weeks is standard. Also, the server run time estimate is way wrong. It is underestimating run time, causing volunteers' clients to download too much work. Then the tasks are late and volunteers don't get credit for the work done. Reno, NV Team: SETI.USA |
Pete Broad Joined: 1 Sep 11 Posts: 1 Credit: 59,491 RAC: 0 |
Yeah, deadlines are too tight. I attached a couple of machines and they're both struggling to get the work back in time. I've noticed tasks quite often remain at 100% completed for an hour or more! Pete |
STE\/E Joined: 2 Sep 11 Posts: 4 Credit: 21,947,717 RAC: 37,087 |
Set your Preferences to anything other than .1 and you run the risk of not getting the WUs back in time. Not many GPU users are going to stand for that very long, as most like to carry more work than that for their GPUs. I haven't received any new work from any of the other CPU Projects since I started running YAFU, even though they're all set to the same Resource Share, because YAFU just takes over everything. Not Nice at all ... Steve* |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Joined: 22 Aug 11 Posts: 739 Credit: 17,612,101 RAC: 10 |
"Also, the server run time estimate is way wrong. It is underestimating run time, and causing volunteers' clients to download too much work." I increased the estimates for new workunits. |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Joined: 22 Aug 11 Posts: 739 Credit: 17,612,101 RAC: 10 |
"I haven't received any new work from any of the other CPU Projects since I started running YAFU even though they're all set to the same Resource Share because YAFU just takes over everything. Not Nice at all ... Steve*" The new Boinc client handles this very well. I even run YAFU in parallel with other projects. If too much YAFU work was done and too little for the other projects, then Boinc doesn't download more YAFU work. At least that is how it behaves on my i5 with a nearly up-to-date Boinc version. I need the results fast. If I wait some days, then there is a high probability that the number has already been factored and the Boinc factorisation isn't needed anymore. Then it was a waste of computing. |
Cruncher Pete Joined: 28 Aug 11 Posts: 6 Credit: 29,599,009 RAC: 115 |
I also feel that there is something wrong with the setup as is. I am running a number of fast machines (i7 980s) with the latest BOINC version, and I am not running any other projects at the same time. Even running 12 threads, the work does not complete in time. Please check how many of my submitted WUs are marked as errored out, NO CREDIT. Tom needs results in a hurry and we are trying our best, but at least give us credit for our efforts... |
AMDave Volunteer moderator Volunteer tester Joined: 30 Aug 11 Posts: 41 Credit: 100,018 RAC: 0 |
"I increased the estimates for new workunits." Excellent! That helps a lot. Thank you. BTW, are you saying that the BOINC 6.12.x client manages yafu tasks alongside other projects better than the 6.10.x client? I have 2 identical machines, 1 now with 6.12.33 and the other with 6.10.58. I will run them side by side for a day. Akk. The 6.12.33 machine just downloaded 13 tasks. A few are estimated at around 3.5 hrs, the rest at 1.1 hours. My General setting is to cache only 4 hours of work. Some of these will not make the 10 hour deadline if the client is to do any of the other project work as well. Something is amiss. I'll stop the other WUs and get as much yafu returned on that machine as possible before the deadline. The 6.10.53 machine is operating ok-ish, but I just lost 2 consecutive 24-hour WUs to computation errors on another project on this machine since I put yafu on it. That could be coincidence, but I am not sure my sociability testing is going very well. If the next WU on that project fails without yafu on the machine, I can eliminate yafu as the influence. For the time being, I will restrict yafu to dedicated machines without other projects on them, and remove it from the machines that are running other projects well. |
AMDave Volunteer moderator Volunteer tester Joined: 30 Aug 11 Posts: 41 Credit: 100,018 RAC: 0 |
I have just had to abort over 500 tasks in the queue on 1 machine, all estimated at 3 hrs 45 min and all due in 2 hours' time. Something is still not right. |
AMDave Volunteer moderator Volunteer tester Joined: 30 Aug 11 Posts: 41 Credit: 100,018 RAC: 0 |
The 500+ WUs that I aborted on 1 host this morning were not forwarded to any other hosts. The result shows "max # of error/total/success tasks: 1, 1, 1" and "errors: Too many total results". With limits of 1, they did not get regenerated and re-queued. Is this intentional? |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Joined: 22 Aug 11 Posts: 739 Credit: 17,612,101 RAC: 10 |
Yes, this is intentional. The workunits are fetched from the factordb as numbers that need to be factored. If they were resent in the Boinc system, they might meanwhile get factored some other way in the factordb. Then the Boinc result would be a waste of computing power. If they have not been factored in the meantime, I will fetch them again later during workunit generation. |
AMDave Volunteer moderator Volunteer tester Joined: 30 Aug 11 Posts: 41 Credit: 100,018 RAC: 0 |
Thanks Yoyo. |
[AF>Le_Pommier] Aillas Joined: 7 Sep 11 Posts: 12 Credit: 581,331 RAC: 0 |
"AFAIR the deadline is 10h and a workunit (not the fast ones) runs ~1h." Not quite true. I have 2 WUs that have each been running for 16 hours now on a 3 GHz Core i7, and they are not finished. In my account, both are marked as "Timed out - no response". I hope you will grant credit for those WUs. I would not like to lose more than 30 hours of computing. Regards |
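
The deadline arithmetic discussed in this thread can be checked with a rough sketch like the one below. It is only a simplified model and not how the BOINC client actually schedules work: it assumes tasks run in queue order, each on whichever core frees up first, and that all tasks share one deadline counted from the moment of download. The function name `count_late_tasks` and the core count of 12 are hypothetical; the task count, estimate, and deadline come from the report above of 500 tasks estimated at 3 hrs 45 min and due in 2 hours.

```python
import heapq


def count_late_tasks(estimates_h, deadline_h, n_cores):
    """Count queued tasks that would finish after the deadline.

    Simplified model (an assumption, not BOINC's real scheduler):
    tasks run in queue order, each on the core that frees up first,
    and all tasks share one deadline measured from download time.
    """
    # Each heap entry is the time (in hours) at which a core becomes free.
    cores = [0.0] * n_cores
    heapq.heapify(cores)
    late = 0
    for estimate in estimates_h:
        start = heapq.heappop(cores)   # earliest available core
        finish = start + estimate
        if finish > deadline_h:
            late += 1
        heapq.heappush(cores, finish)
    return late


# Figures reported in the thread: 500 tasks, 3.75 h each, due in 2 h.
# The core count (12) is a hypothetical value for illustration.
late = count_late_tasks([3.75] * 500, deadline_h=2.0, n_cores=12)
print(f"{late} of 500 tasks would miss the deadline")
```

Because a single task's estimate already exceeds the time left, every task in that queue is late no matter how many cores the host has, which fits the decision to abort the whole batch.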