Tasks won't finish...
Author | Message |
---|---|
PhilTheNet Send message Joined: 2 Jul 15 Posts: 7 Credit: 457,444 RAC: 0 |
After completing a task I get this message: "Tasks won't finish in time: BOINC runs 98% of the time, computation is enabled 99.9% of that", and since then BOINC refuses to load another task. Does anyone have a solution? Thanks |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 22 Aug 11 Posts: 736 Credit: 17,612,101 RAC: 51 |
Which application do you have selected? The yafu-small ones are short runners. |
PhilTheNet Send message Joined: 2 Jul 15 Posts: 7 Credit: 457,444 RAC: 0 |
All apps. The app finished on the computer: 3473621 2898239 16 Oct 2018, 4:44:41 UTC 16 Oct 2018, 7:51:08 UTC Completed and validated 11,154.19 2,615.05 133.53 YAFU v134.05 (mt) windows_x86_64. I have another computer that does not have this problem. |
PhilTheNet Send message Joined: 2 Jul 15 Posts: 7 Credit: 457,444 RAC: 0 |
After resetting the project on the computer, one more task was loaded and calculated, and afterwards the same message again... no more tasks??? |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 22 Aug 11 Posts: 736 Credit: 17,612,101 RAC: 51 |
How many other projects do you have in this BOINC Manager, and what state are they in? yoyo |
PhilTheNet Send message Joined: 2 Jul 15 Posts: 7 Credit: 457,444 RAC: 0 |
SETI on the GPU and 2 WUs of WWGrid, with 2 CPUs free. The strange thing is that every time it calculates one WU without a problem but refuses to start a second WU with the same configuration, and nothing has changed :( ??? |
PhilTheNet Send message Joined: 2 Jul 15 Posts: 7 Credit: 457,444 RAC: 0 |
Now it works, without my having changed anything in the configuration :) |
Steve Dodd Send message Joined: 12 Oct 16 Posts: 17 Credit: 10,185,129 RAC: 102 |
I have WUs that run at 100% and don't seem to finish - ever. I have a 16T task running over 10.5 hrs. @100% on 23 cores. Similar problem with a 4T WU. I don't know if this is standard behavior or not at the end of the computation. Would not like to lose this much compute time. When I look in Task Manager, there is no CPU time being taken for the Yafu task, but it's holding everything else from running. |
CoolAtchOk Send message Joined: 18 Nov 18 Posts: 2 Credit: 10,378,696 RAC: 0 |
I ran one task for 7 days and finally canceled it. ))) https://yafu.myfirewall.org/yafu/result.php?resultid=3783335 I see on my other hosts that many jobs cannot be completed. They show a progress of 100% and load one processor core while holding the rest of the cores hostage. IMHO, if a WU has allegedly been working for more than one day, it is very likely that it will never complete, and the task should be terminated. |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 22 Aug 11 Posts: 736 Credit: 17,612,101 RAC: 51 |
As far as I can see in the log, this WU was still running. It was sent to a different user, where it was completed. At the end a WU usually runs on a single core only. You might check the slot directory to see whether the files there are still changing. I would say something will be written to the files at least every 2 hours. Checkpointing is also used in the later stages of the WU run. The nfs.dat file is the checkpoint file. |
Steve Dodd Send message Joined: 12 Oct 16 Posts: 17 Credit: 10,185,129 RAC: 102 |
Are you saying that if that file is still being written to, that the WU is still running normally? |
yoyo_rkn Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 22 Aug 11 Posts: 736 Credit: 17,612,101 RAC: 51 |
"Are you saying that if that file is still being written to, that the WU is still running normally?" Yes. |
Conan Send message Joined: 5 Sep 11 Posts: 46 Credit: 7,373,200 RAC: 3,734 |
Last night I had to abort a work unit (the first for me, I think), as it had stopped doing anything. This WU had been running for nearly 3 days. When I checked the task manager, it showed it as being "Stopped" (both java and yafu were in this condition). I suspended and then restarted the work unit, but it did the same thing and went back to "Stopped". It had been holding up 4 cores (of a 4-core machine) for 251,709 seconds but had only done 32,392 CPU seconds of work. The slot showed no activity. Conan |
Speedy51 Send message Joined: 25 Jan 12 Posts: 23 Credit: 1,529,974 RAC: 0 |
I have a T8 task that has been running for almost 14 hours 45 minutes. It is checkpointing and the .DAT file is increasing in size. When I started my computer this morning, I noticed that, according to the elapsed time in BOINC, I had lost approximately 2 hours of running time. I am guessing this means that processing time had to be redone to get past the point where I turned the computer off? I'm not sure at what frequency, but I know the task is saving more often than every 2 hours. I am aware that I can use hibernation; however, I am not keen on my computer turning on at night, as it is in my room. For reference, the checkpoint file size is 2.4 GB and it was updated 6 minutes ago. TIA for any information |
Speedy51 Send message Joined: 25 Jan 12 Posts: 23 Credit: 1,529,974 RAC: 0 |
The task I was referring to in my previous post had a run time of 16 hours 49 min 28 sec; CPU time was exactly the same. |
[AF] Alliance Francophone Send message Joined: 1 Aug 16 Posts: 2 Credit: 1,269,113 RAC: 2 |
Hello, There's a task running on my computer for 21d 13h 27m and counting. I was away for a fortnight and couldn't check the computer. Since I'm back, I have to suspend it from time to time to let other tasks finish. Now I'm wondering if the task is still doing anything... WU 3765851, my computer is no. 37221. The only file that's updating is graphics_status.xml. It shows now: <cpu_time>1850574.843750</cpu_time> <elapsed_time>1868369.453125</elapsed_time> Then there is init_data.xml, last update June 25, showing: <wu_cpu_time>21986.450000</wu_cpu_time> <starting_elapsed_time>9444.520247</starting_elapsed_time> All other files - except the EXE - are from June 8, the last line of factor.log being: 06/08/19 18:38:11 v1.34.5 @ ANTEC2018WIN7, nfs: commencing lattice sieving with 5 threads Is this task still crunching or should I abort it, like some others did before and after me? |
Speedy51 Send message Joined: 25 Jan 12 Posts: 23 Credit: 1,529,974 RAC: 0 |
Hello, If you are able to, I would let it run continuously until the end of 6 July; this is 10 days after its deadline, which was 27 June. After 10 days I believe you do not receive any credit for it. The choice is up to you. |
hsdecalc Send message Joined: 3 Apr 16 Posts: 3 Credit: 2,729,728 RAC: 0 |
Same here: http://yafu.myfirewall.org/yafu/workunit.php?wuid=3918217 I'm number seven for this packet. It has been running endlessly for days. Last lines: 07/11/19 20:00:38 v1.34.5 @ CRUNCHER, nfs: previous data file found - commencing search for last special-q 07/11/19 20:00:49 v1.34.5 @ CRUNCHER, nfs: parsing special-q from .dat file 07/11/19 20:00:49 v1.34.5 @ CRUNCHER, nfs: commencing nfs on c117: 191226036716312529204612193243490612051058835202779446558915418869961129767376387041998602418100041559534005184448597 07/11/19 20:00:49 v1.34.5 @ CRUNCHER, nfs: resuming with filtering 07/11/19 20:00:49 v1.34.5 @ CRUNCHER, nfs: commencing lattice sieving with 6 threads Aborting now, time limit reached. A lot of wasted time. |
[AF] Alliance Francophone Send message Joined: 1 Aug 16 Posts: 2 Credit: 1,269,113 RAC: 2 |
Task still running, although I had to restart the VM twice. There's a checkpoint at 2h48 and 97.x% (and the task is again at over 1d21h of calculation). But I have just seen that the wingman who got it last night already finished it, so I will abort mine. WU 3765851 |
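
For anyone who wants to automate yoyo_rkn's advice (check whether the files in the slot directory are still changing, with something written at least every 2 hours), here is a minimal Python sketch. It is not part of the project or of BOINC. The function names, the 2-hour default and the example path are assumptions for illustration. graphics_status.xml is ignored because, as reported in this thread, it can keep updating while every other file stays untouched.

```python
import time
from pathlib import Path

# Assumption: "at least every 2 hours" from the thread is used as the idle limit.
TWO_HOURS = 2 * 60 * 60


def seconds_since_last_write(slot_dir, ignore=("graphics_status.xml",), now=None):
    """Seconds since the newest file in slot_dir was modified.

    Files named in `ignore` are skipped. Returns None if no file is left to check.
    """
    now = time.time() if now is None else now
    mtimes = [
        p.stat().st_mtime
        for p in Path(slot_dir).iterdir()
        if p.is_file() and p.name not in ignore
    ]
    return None if not mtimes else now - max(mtimes)


def looks_alive(slot_dir, max_idle=TWO_HOURS, now=None):
    """True if some non-ignored file in slot_dir changed within max_idle seconds."""
    idle = seconds_since_last_write(slot_dir, now=now)
    return idle is not None and idle <= max_idle


# Hypothetical usage; the slot path depends on your BOINC data directory:
# print(looks_alive(r"C:\ProgramData\BOINC\slots\0"))
```

A False result only means nothing was written recently. It is a hint to look closer (for example at factor.log), not proof that the task is dead.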