Too Late To Validate

Message boards : Number crunching : Too Late To Validate

Chris Granger
Joined: 2 Sep 11
Posts: 3
Credit: 34,381
RAC: 0
Canada
Message 39 - Posted: 3 Sep 2011, 0:21:38 UTC

Are the deadlines for work units really short for this project? My Linux machine was throwing up errors, so I attached with a slower Windows machine but finished my first work unit as invalid, with the error "Completed, too late to validate"...

This isn't the fastest machine around (2GHz dual core) but neither is it the slowest. Can you look into this and see if something is wrong?
AMDave
Volunteer moderator
Volunteer tester
Joined: 30 Aug 11
Posts: 41
Credit: 100,018
RAC: 0
Australia
Message 41 - Posted: 3 Sep 2011, 1:28:06 UTC

The yafu deadline is so short that the BOINC Manager is preventing any other project from running while there is yafu work in the queue.

Please extend the yafu deadline to at least 3 days, as soon as possible.
yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 22 Aug 11
Posts: 736
Credit: 17,612,101
RAC: 38
Germany
Message 50 - Posted: 3 Sep 2011, 8:23:46 UTC

Hello,

I will not extend the deadline. I need the results fast.
Yes, once a workunit is downloaded, BOINC starts it immediately.
But I also run YAFU, and other projects run very well between YAFU workunits. BOINC doesn't download new YAFU workunits as soon as all are finished. First, other projects get CPU time according to their resource share. Only when YAFU's resource share is endangered does BOINC download new workunits.
I use nearly the latest BOINC version.
Chris Granger
Joined: 2 Sep 11
Posts: 3
Credit: 34,381
RAC: 0
Canada
Message 52 - Posted: 3 Sep 2011, 11:25:37 UTC

So a 2GHz machine is too slow to be useful? OK, detaching my Windows box...
bellialuss
Joined: 3 Sep 11
Posts: 1
Credit: 7,867
RAC: 0
Poland
Message 53 - Posted: 3 Sep 2011, 11:48:42 UTC - in response to Message 52.  

So a 2GHz machine is too slow to be useful? OK, detaching my Windows box...

Not only your machine :D I have 2x 3.5 GHz and get the same message... The workunit should crunch for 25 min, but it has now run for 40 min and the remaining time shows "--". A bug?
zombie67 [MM]
Joined: 29 Aug 11
Posts: 38
Credit: 13,384,348
RAC: 0
United States
Message 54 - Posted: 3 Sep 2011, 15:14:30 UTC

I think there are a couple of problems here.

1) The deadline is way too short. Needs a few days added at least.

2) The first tasks were very short compared to what is being issued now. So BOINC learned how short the tasks were and then downloaded too many of them, with no way to finish them in time. The DCF (duration correction factor) will correct for this over time.
Reno, NV
Team: SETI.USA
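The over-download effect described in point 2 can be illustrated with a small sketch. This is not BOINC's actual work-fetch code: the function name, the simplified model (tasks run back to back on each core and share one deadline counted from download), the 6-hour buffer, and the 15-minute estimate are all assumptions made for illustration. The 10-hour deadline and roughly 1-hour run time are the figures quoted elsewhere in this thread.

```python
import math

def late_tasks(buffer_hours, estimated_hours, actual_hours, deadline_hours, cores=1):
    """Return (tasks_fetched, tasks_late) for a simplified client model.

    Assumption: the client fills its buffer using the *estimated* run time,
    then runs tasks back to back on each core; every task shares a single
    deadline counted from the moment of download.
    """
    # Number of tasks the client believes will fit in its buffer.
    fetched = math.floor(buffer_hours * cores / estimated_hours)
    # Number of tasks each core can really finish before the deadline.
    on_time_per_core = math.floor(deadline_hours / actual_hours)
    on_time = min(fetched, on_time_per_core * cores)
    return fetched, fetched - on_time

# Estimate learned from the early short tasks (assumed 15 min), real run time 1 h.
print(late_tasks(6, 0.25, 1.0, 10))  # (24, 14)
# Same buffer once the estimate matches the real run time.
print(late_tasks(6, 1.0, 1.0, 10))   # (6, 0)
```

In this toy model the same buffer setting produces 14 late tasks with the stale estimate and none once the estimate is corrected, which is roughly the adjustment the DCF is expected to make over time.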
yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 22 Aug 11
Posts: 736
Credit: 17,612,101
RAC: 38
Germany
Message 57 - Posted: 3 Sep 2011, 17:24:02 UTC

AFAIR the deadline is 10h and a workunit (not the fast ones) runs ~1h.
ChertseyAl
Joined: 29 Aug 11
Posts: 7
Credit: 11,046
RAC: 0
United Kingdom
Message 62 - Posted: 3 Sep 2011, 19:05:37 UTC - in response to Message 50.  


I will not extend the deadline.


Good luck with that.

I was going to post a good old rant about BOINC being about cooperation between the project and the volunteers, and how volunteers are pretty much the only thing that keeps the project alive, supplying processor power, paying for their own equipment, paying for the electricity, etc etc. But I won't.

Will I continue to crunch this project? Probably. At least until I've hit the MM I'm looking for. As will many others I'm sure. Once the hardcore Milestoners are sated I'm not sure who is going to crunch a project that so aggressively takes over resources.

Whatever, keep on crunchin' guys! ;)

Al.
zombie67 [MM]
Joined: 29 Aug 11
Posts: 38
Credit: 13,384,348
RAC: 0
United States
Message 63 - Posted: 3 Sep 2011, 19:16:24 UTC - in response to Message 57.  
Last modified: 3 Sep 2011, 19:17:55 UTC

AFAIR the deadline is 10h and a workunit (not the fast ones) runs ~1h.



This prevents a volunteer from downloading even half a day's worth of work. Not everyone is connected to the internet 24/7. Due dates should be no less than 24 hours. Even 48 to 72 hours is much shorter than most projects use; 1-2 weeks is standard.

Also, the server's run time estimate is way off. It underestimates the run time, causing volunteers' clients to download too much work. The results then come back late and the volunteers don't get credit for work done.
Reno, NV
Team: SETI.USA
Pete Broad
Joined: 1 Sep 11
Posts: 1
Credit: 59,491
RAC: 0
United Kingdom
Message 64 - Posted: 3 Sep 2011, 20:02:39 UTC

Yeah, deadlines are too tight. I attached a couple of machines and they're both struggling to get the work back in time. I've noticed they quite often remain at 100% completed for an hour or more!


Pete
STE\/E
Joined: 2 Sep 11
Posts: 4
Credit: 21,550,064
RAC: 20,387
United States
Message 65 - Posted: 3 Sep 2011, 20:42:22 UTC
Last modified: 3 Sep 2011, 20:45:09 UTC

Set your Preferences to anything other than .1 and you run the risk of not getting the WUs back in time. Not many GPU users are going to stand for that for very long, as most like to carry more work than that for their GPUs. I haven't received any new work from any of the other CPU projects since I started running YAFU, even though they're all set to the same Resource Share, because YAFU just takes over everything. Not nice at all ... Steve*
yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 22 Aug 11
Posts: 736
Credit: 17,612,101
RAC: 38
Germany
Message 66 - Posted: 3 Sep 2011, 20:52:26 UTC - in response to Message 63.  

Also, the server's run time estimate is way off. It underestimates the run time, causing volunteers' clients to download too much work.

I increased the estimates for new workunits.
yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 22 Aug 11
Posts: 736
Credit: 17,612,101
RAC: 38
Germany
Message 67 - Posted: 3 Sep 2011, 20:57:32 UTC - in response to Message 65.  

I haven't received any new work from any of the other CPU projects since I started running YAFU, even though they're all set to the same Resource Share, because YAFU just takes over everything. Not nice at all ... Steve*

The new BOINC client handles this very well. I even run YAFU in parallel with other projects. If too much YAFU work was done and too little for the other projects, then BOINC doesn't download more YAFU work. At least that is the case on my i5 with a nearly up-to-date BOINC version.

I need the results fast. If I wait some days, then there is a high probability that the number has already been factored and the BOINC factorisation isn't needed anymore. Then it was a waste of computing.
Cruncher Pete
Joined: 28 Aug 11
Posts: 6
Credit: 29,599,009
RAC: 416
Australia
Message 68 - Posted: 3 Sep 2011, 21:56:14 UTC - in response to Message 67.  

I also feel that there is something wrong with the setup as it is. I am running a number of fast machines (i7 980s) with the latest BOINC version, and I am not running any other projects at the same time. Even running 12 threads, the work does not complete in time. Please check how many of my submitted WUs say that they errored out, NO CREDIT. Tom needs results in a hurry, we are trying our best but at least give us credit for our efforts...
AMDave
Volunteer moderator
Volunteer tester
Joined: 30 Aug 11
Posts: 41
Credit: 100,018
RAC: 0
Australia
Message 69 - Posted: 3 Sep 2011, 23:39:42 UTC - in response to Message 66.  
Last modified: 4 Sep 2011, 0:05:27 UTC

I increased the estimates for new workunits.

Excellent!
That helps a lot.
Thank you.

BTW - are you saying that BOINC 6.12.x manages yafu tasks with other projects better than 6.10.x client?

I have 2 identical machines, 1 now with 6.12.33 and the other with 6.10.58. I will run them side by side for a day.

Akk. The 6.12.33 machine just downloaded 13 tasks.
A few estimated at around 3.5 hrs, the rest are estimated at 1.1 hours.
My General setting is to cache only 4 hours.
Some of these will not make the 10 hour deadline if the client is to do any of the other project work as well.
Something is amiss.
I'll stop the other WUs and get as much Yafu returned on that machine as possible before the deadline.

The 6.10.53 machine is operating ok-ish, but I just lost 2 consecutive 24-hour WUs to computation errors on another project on this machine since I put yafu on.
That could be coincidence, but I am not sure my sociability testing is going very well.
If the next WU on that project fails without yafu on it I can eliminate yafu as the influence.

For the time being, I will restrict yafu to dedicated machines without other projects on them, but remove it from the machines that are running other projects well.
AMDave
Volunteer moderator
Volunteer tester
Joined: 30 Aug 11
Posts: 41
Credit: 100,018
RAC: 0
Australia
Message 100 - Posted: 7 Sep 2011, 20:56:35 UTC

I have just had to abort over 500 tasks in the queue on 1 machine, all estimated at 3 hrs 45 min and all due in 2 hours' time.

Something is still not right
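For scale, the arithmetic behind this report can be written out as a short sketch. It uses only the figures from the post above (500 tasks, 3 hrs 45 min each, 2 hours to the deadline); treating "over 500" as exactly 500 and taking the client's estimate at face value are assumptions.

```python
tasks = 500            # tasks aborted ("over 500" in the post, rounded down)
estimate_hours = 3.75  # 3 hrs 45 min estimated per task
hours_left = 2.0       # time remaining until the deadline

# Total estimated CPU time sitting in the queue.
total_cpu_hours = tasks * estimate_hours

# A task can only be on time if its own run time fits before the deadline.
any_task_can_finish = estimate_hours <= hours_left

print(total_cpu_hours)      # 1875.0
print(any_task_can_finish)  # False
```

If the estimates are accurate, no number of cores would help here, because each individual task needs longer than the time that is left.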
AMDave
Volunteer moderator
Volunteer tester
Joined: 30 Aug 11
Posts: 41
Credit: 100,018
RAC: 0
Australia
Message 101 - Posted: 8 Sep 2011, 9:10:03 UTC

The 500+ WUs that I aborted from 1 host this morning did not get on-forwarded to any other hosts.

The result shows:
max # of error/total/success tasks	1, 1, 1
errors	 Too many total results

With limits of 1, they did not get regenerated and requeued.

Is this intentional?
yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 22 Aug 11
Posts: 736
Credit: 17,612,101
RAC: 38
Germany
Message 103 - Posted: 8 Sep 2011, 11:04:26 UTC - in response to Message 101.  

Yes, this is intentional.
The workunits are fetched from the factordb as numbers that need to be factored.
If they are resent in the BOINC system, they might have been factored some other way in the factordb in the meantime. Then the BOINC result is a waste of computing power.
If they are not yet factored in the meantime, I will fetch them again later during workunit generation.
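The policy described above (no automatic resend, refetch later only if still unfactored) could look roughly like the following sketch. It is not the project's actual workunit generator: `next_batch`, `is_factored`, and `in_flight` are hypothetical names, and the status check is a stand-in for whatever lookup the real generator uses.

```python
def next_batch(candidates, is_factored, in_flight):
    """Pick numbers for new workunits (hypothetical helper).

    candidates  -- numbers currently listed as needing factoring
    is_factored -- callable telling whether a number has since been factored
    in_flight   -- set of numbers already sent out as workunits
    """
    batch = []
    for n in candidates:
        if is_factored(n):   # solved elsewhere in the meantime: skip it
            continue
        if n in in_flight:   # already out with a volunteer: do not duplicate
            continue
        batch.append(n)
    return batch

factored = {15}
# 21 is still out with a volunteer, 15 was factored elsewhere.
print(next_batch([15, 21, 35], lambda n: n in factored, {21}))   # [35]
# After 21 is aborted or times out, it is picked up again only if still open.
print(next_batch([15, 21, 35], lambda n: n in factored, set()))  # [21, 35]
```

Under this model an aborted task simply drops out of the in-flight set and comes back in a later generation run, provided nobody has factored the number by then.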
AMDave
Volunteer moderator
Volunteer tester
Joined: 30 Aug 11
Posts: 41
Credit: 100,018
RAC: 0
Australia
Message 106 - Posted: 8 Sep 2011, 12:28:15 UTC - in response to Message 103.  

Thanks Yoyo.
[AF>Le_Pommier] Aillas
Joined: 7 Sep 11
Posts: 12
Credit: 581,331
RAC: 0
France
Message 114 - Posted: 9 Sep 2011, 12:54:55 UTC - in response to Message 57.  

AFAIR the deadline is 10h and a workunit (not the fast ones) runs ~1h.


Not quite true.

I have 2 WUs that have now been running for 16 hours each on a 3 GHz Core i7, and they are not finished.

In my account, both are marked as "Timed out - no response".

I hope you will grant credit for those WUs. I would not like to lose more than 30 hours of computing.

Regards

Datenschutz / Privacy Copyright © 2011-2024 Rechenkraft.net e.V. & yoyo