Posts by admpicard999

1) Message boards : Number crunching : Work units crash computer and restart at zero (Message 934)
Posted 20 Feb 2017 by admpicard999
Post:
I've had the same issue happen twice now. I wasn't able to write to the message boards before because of the anti-spam requirements. Thankfully one work unit did complete successfully between the two times so I'm able to post now.

With both Work Unit 835263 and 848116, my computer spent the better part of a day (about 10 hours each, I believe) processing the work unit on multiple cores. With the first one, I shut my computer off because a thunderstorm was coming, and when I booted it back up, all of the nearly 70 hours of computing time (10 hours on 7 cores) was lost because the progress had been reset to zero. With the more recent one, my computer unexpectedly shut off (this does not happen with any other BOINC projects), and when I turned it back on, the work unit was back at 0%. In both instances I aborted the work unit, and the website shows it as having been worked on for only a few seconds or minutes instead of many hours on multiple cores.

Is there no checkpointing in this project, so that a work unit doesn't have to start over when the system restarts? If there isn't, this seems like a serious oversight for work units with 12-hour expected runtimes (my laptop is a bit older, but still gets the job done). Or is there some other reason why the work units would start over when they were already more than 70% complete?
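
To be clear about what I mean by checkpointing, here is a minimal sketch in Python. It is not this project's code and not the BOINC API; the function names, the JSON state file, and the stop_after parameter (which just simulates a power cut) are all made up for illustration. The idea is only that the application saves its progress every so often and picks up from the last save after a restart.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temp file first, then rename it over the old checkpoint,
    so a shutdown in the middle of a write cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run(path, n_steps, checkpoint_every=100, stop_after=None):
    """Toy work unit: sums the step indices, checkpointing periodically.
    stop_after (hypothetical) ends the run early to mimic a sudden shutdown."""
    state = load_checkpoint(path)  # resume from the last save, if any
    done_this_run = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return state  # simulated shutdown: unsaved progress is lost
        state["total"] += state["next_step"]
        state["next_step"] += 1
        done_this_run += 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

With something like this, an interrupted run loses at most the work done since the last save (here, up to checkpoint_every steps) rather than everything back to 0%.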
