Posts by UBT - Timbo

1) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1437)
Posted 18 Aug 2019 by UBT - Timbo
Post:
In the last phase, 8 gnfs jobs are started and run for roughly 45 minutes. Afterwards, the results of the 8 gnfs jobs are put into nfs.dat.
Then yafu.exe runs to check whether enough gnfs work has been done. If not, another 8 gnfs jobs are started, and so on.
Once it is enough, yafu.exe computes the result. This is the single run at the end, which should not take longer than 2 hours.
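A rough sketch of the loop described in the quote above. Everything here is a stand-in: the function names, the "relations" counter used to model nfs.dat growing, and the random numbers are illustrative assumptions, not how yafu.exe or gnfs are actually implemented.

```python
import random

def run_sieve_batch(rng, jobs=8):
    """Stand-in for one batch of parallel gnfs jobs (about 45 minutes in the real app).

    Returns a made-up amount of useful output per job.
    """
    return [rng.randint(50, 100) for _ in range(jobs)]

def factor_with_batches(needed, jobs=8, seed=0):
    """Repeat batches of `jobs` sieving jobs until enough output has been collected."""
    rng = random.Random(seed)
    collected = 0  # stand-in for the contents of nfs.dat
    batches = 0
    # Stand-in for yafu.exe checking "is it enough gnfs?" after every batch.
    while collected < needed:
        collected += sum(run_sieve_batch(rng, jobs))
        batches += 1
    # The final single run that computes the result would happen here.
    return batches, collected

if __name__ == "__main__":
    batches, collected = factor_with_batches(needed=2000)
    print(f"{batches} batches of 8 jobs, {collected} units collected")
```

The point of the sketch is only the shape of the control flow: parallel batches repeat until the check passes, and the last step runs on its own.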


Hi yoyo

Thanks for the explanation - very helpful to understand how the project works and what one should expect to happen. :-)

But I see that gnfs ran single threaded over many days. So there were not 8 gnfs jobs running in parallel, only 1. This also explains why the task ran for many days.
When you restarted BM, the gnfs phase was already over and only the combining step was run.

yoyo


The particular PC has run many other projects using multiple threads without any issues. BUT I will mention that it hasn't been switched off (i.e. hard rebooted) for some time, so it is possible that the Yafu application and tasks are somehow not behaving well on this piece of hardware, or that there is a hardware issue.

There is even the possibility that the single threaded yafu task actually crashed and hence was unresponsive - with the BM restart actually getting Yafu to complete its task on time.

However, since this task ended, I am now running 16 separate tasks for SETI@home (for the WOW! event challenge) and every task is completing correctly and is being validated too.

So, I would assume that the vcores are working OK, but somehow, the Yafu application might have had an issue with the hardware (which is a Xeon E5-2670 CPU, with 8 GB RAM and Win 7 Pro).

I will try a couple of other Yafu tasks and monitor them closely to see if they fully use the CPU(s) they need during the processing.

regards and thanks for your kind assistance.
Tim
2) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1435)
Posted 17 Aug 2019 by UBT - Timbo
Post:
Hi yoyo

I checked this PC just now and the task HAS now completed, been uploaded and validated.

Total "run time" (on the task list): 311,354.77 seconds...that is claimed to be about 86.5 hours...although I know it took over 7 days, before I re-started BM.
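For reference, the conversion of the reported run time can be checked with a few lines of Python. The helper name is made up for this example; the only input taken from the post is the 311,354.77 seconds figure.

```python
def split_seconds(total_seconds):
    """Return (hours as a float, (days, hours, minutes, seconds)) for a run time in seconds."""
    hours = total_seconds / 3600
    days, rem = divmod(int(total_seconds), 86400)
    h, rem = divmod(rem, 3600)
    m, s = divmod(rem, 60)
    return hours, (days, h, m, s)

if __name__ == "__main__":
    hours, (d, h, m, s) = split_seconds(311354.77)
    print(f"{hours:.1f} hours = {d} days {h} hrs {m} mins {s} secs")
```

So the run time shown in the task list works out to roughly 3 days 14.5 hours, well short of the 7 days of wall-clock time.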

It'll be interesting to see what you make of the stderr.log... ;-)

regards
Tim
3) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1434)
Posted 17 Aug 2019 by UBT - Timbo
Post:
Further info, left off last msg.

yafu.exe is claiming 6% (ie 1/16th of the 16 CPUs available)...so it is clearly doing "something" but what it is doing I have no idea :-(

There is just under 1 hour to go until the deadline is reached and the progress is now 99.984%.

I cannot check on the PC when the deadline occurs, so I will check again in the morning, but I suspect that it will fail, due to the deadline expiring.

If that is the case, then hopefully, you will see a report in your logs about what caused it to fail?

regards
Tim
4) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1433)
Posted 17 Aug 2019 by UBT - Timbo
Post:
That is a behaviour which I haven't seen before.
Is there only one gnfs process running? It should be 8 of them.


Hi

There are NO gnfs processes running

All that are running are:

yafu.exe - 408,128k
yafuwrapper_26014_windows_x86_64.exe - 3,956k

So, given that perhaps gnfs is supposed to be running I took a gamble and shut down BM and restarted it....

The elapsed time has now gone back to 3 days 11 hrs and 13 mins (was 7 days) and the percentage completed is now 99.978% (was 100%).

And even now, gnfs is not running, assuming it should be?

regards
Tim
5) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1431)
Posted 17 Aug 2019 by UBT - Timbo
Post:
Files in the slot directory are still changing?


Hi yoyo

Yup - still updating - last update was 1 hr 12 mins ago.

nfs.dat is now at 2.5 GB in size... !!

regards
Tim
6) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1429)
Posted 17 Aug 2019 by UBT - Timbo
Post:
I would let it run, since nfs.dat is only 3 hours old and ggnfs was recently updated.
I assume that nfs is in its last phase combining everything, which is single threaded.


Hi yoyo

Thanks again.

It is still "running" and the deadline (according to BOINC Manager) is 13 Aug 2019, 23:20 UTC.

So I hope that the extra 5 days "allowance" for tasks that do not complete by the deadline, is not a "fixed" amount, as the "BOINC Manager" deadline + 5 days brings it to 18 Aug 00:20 BST - which is just after midnight tonight !!

It seems that this one task is going to be problematic...but I don't want to restart BOINC Manager as this task *could* finish before the deadline and all would be well.

But I do not have any confidence that restarting BM will work and it could just restart from the last checkpoint, and lots of time will have been lost...and there is no guarantee that it will actually complete and validate even after that.

So, the clock is running - just over 12 hours to go and we will see what happens !!

PS: I have now shut down all other processing on this PC, so that the other 15 HT cores can be freed up to allow this task to complete and hence no CPU cycles will be doing any other tasks. So, even if this task is now running single threaded, the other 7 "real" CPUs are not doing anything else to limit this one Yafu-8T task from completing.

PPS: One annoying thing about these multi-CPU Yafu tasks, is if they are running single threaded at the end, why can't the other 7 cores be "freed up" so that other tasks can be running. In this case, this single Yafu-8T task could have been running using just one CPU for maybe 3 or 4 days and so the other 7 cores have been idle and not doing anything ?

regards
Tim
7) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1427)
Posted 16 Aug 2019 by UBT - Timbo
Post:
The percentage is just an artificial estimate by BOINC. It doesn't have anything to do with real completion. What is important is whether nfs.dat or factor.log are still changing.

Single threaded run for 24 hours is very very strange. It shouldn't be much more than 2 hours.


Hiya

So, I checked and nfs.dat is now 1.6 GB in size and was last updated 3 hours ago (and not every 45 mins or so, which I read somewhere else on this message board).

factor.log is about 5.6 KB and was updated at the same time as wrapper_checkpoint.txt. There is also ggnfs.log, which was updated about one minute before these three.
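Checking by hand whether these files are still changing gets tedious, so here is a minimal sketch that prints the size and age of each one. The four file names are the ones mentioned in this thread; the slot directory path is a placeholder (it depends on the BOINC installation), and the function name is made up for this example.

```python
import time
from pathlib import Path

# File names as mentioned in this thread.
WATCHED = ("nfs.dat", "factor.log", "ggnfs.log", "wrapper_checkpoint.txt")

def file_ages(slot_dir, names=WATCHED, now=None):
    """Return {name: (size in bytes, minutes since last modification)} for files that exist."""
    now = time.time() if now is None else now
    report = {}
    for name in names:
        path = Path(slot_dir) / name
        if path.exists():
            st = path.stat()
            report[name] = (st.st_size, (now - st.st_mtime) / 60)
    return report

if __name__ == "__main__":
    # Placeholder path: replace with the real slot directory of the task.
    for name, (size, age_min) in file_ages("slots/0").items():
        print(f"{name}: {size / 1e6:.1f} MB, last changed {age_min:.0f} min ago")
```

Running it a couple of times an hour apart would show whether nfs.dat is still growing or has stalled.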

Whether it is strange or not, it is happening...

I just need to know whether to kill it or leave it ?

regards
Tim
8) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1425)
Posted 15 Aug 2019 by UBT - Timbo
Post:
If you have not changed it, the 8t uses 8 cores. But at the end (and this is valid for all yafu apps) it runs single threaded. This can take (I would estimate for 8t) 1 hour. But the good thing is that the task is near its end.


Hi yoyo

Thanks for the info.

I've not changed anything - the host PC has been 100% active and powered up at all times over the last week or so.

OK, so it has been running as a single thread for some time now...(as I have been checking the CPU via Task Manager for at least a day) and it seems to have been running as a single thread for at least 24 hours, if not more.

I've been checking the progress and this is how it has been noted:

99.939% as of 9pm Tuesday 13th Aug
99.976% as at 6:30am Weds 14th Aug
99.982% as at 9:28am Weds 14th Aug - just 53 secs to go
99.988% as at 1pm Weds 14th Aug
99.990% as at 3pm Weds 14th Aug...and 31 secs to go - Elapsed time 3 days 19 hrs 16 mins
99.994% as at 9pm Weds 14th Aug - now just 18 secs to go
99.998% as of 11am Thurs 15th Aug - now just 5 secs to go

Currently 99.999% as of 10:30pm Thurs 15th - with 1 sec to go - Elapsed Time: 5 days 2 hrs 44 mins.

I understand that the "time to completion" is a BOINC Manager estimate...but even so, it is very strange that this should be taking so long.

The task was downloaded on 9 Aug 2019, 22:43:06 UTC and the deadline is 17 Aug 2019, 23:20:38 UTC - so hopefully with about 48 more hours to go, it will finish within that time...if not, it'll be a huge waste of resources - as it'll be over 7 days of crunching and using (up to) 8 cores.
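The date arithmetic above can be double-checked with Python's datetime module. The download and deadline timestamps are the ones quoted in this post; treating "10:30pm Thurs 15th" as BST (UTC+1) is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
BST = timezone(timedelta(hours=1), "BST")  # British Summer Time as a fixed UTC+1 offset

downloaded = datetime(2019, 8, 9, 22, 43, 6, tzinfo=UTC)
deadline = datetime(2019, 8, 17, 23, 20, 38, tzinfo=UTC)

# Assumption: the progress check at 10:30pm on Thursday 15th was noted in local time (BST).
checked = datetime(2019, 8, 15, 22, 30, tzinfo=BST)

window = deadline - downloaded        # total time allowed for the task
remaining = deadline - checked        # time left at the moment of the check
deadline_bst = deadline.astimezone(BST)

print("Allowed window:", window)
print("Remaining hours:", round(remaining.total_seconds() / 3600, 1))
print("Deadline in BST:", deadline_bst.strftime("%d %b %Y, %H:%M:%S"))
```

Under that assumption there were a little under 50 hours left, and the deadline falls at 00:20:38 BST on 18 Aug.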

regards
Tim
9) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1423)
Posted 15 Aug 2019 by UBT - Timbo
Post:
Update:

I should point out that the CPU is a 16-core Xeon and as such, the fact that the Yafu-8T task is using just 6% (or 1/16th) of the available CPUs does seem rather strange, as it should (in theory) be using about 50% of the CPUs.
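The percentages follow directly from the thread counts. A tiny sketch, with a made-up helper name, assuming Task Manager reports usage as a share of all 16 logical CPUs:

```python
def cpu_share(busy_threads, logical_cpus):
    """Percentage of total CPU shown when `busy_threads` threads are fully busy."""
    return 100.0 * busy_threads / logical_cpus

if __name__ == "__main__":
    print(cpu_share(1, 16))  # one thread busy: what Task Manager is showing
    print(cpu_share(8, 16))  # eight threads busy: what an 8T task should show
```

One busy thread gives 6.25%, which Task Manager rounds to 6%, while eight busy threads would give 50%.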

regards
Tim
10) Message boards : Number crunching : Long running 8T task - nearly 5 days !! (Message 1422)
Posted 15 Aug 2019 by UBT - Timbo
Post:
Hi
I need some feedback please.

I have been running this 8T task:

https://yafu.myfirewall.org/yafu/result.php?resultid=4986736

and so far it has been running 4 days 19 hours and is at 99.999%.

The nfs.dat and factor.log files are still updating so should I keep this running...?

It seems an awfully long time to process ONE task...even if it is using 8 cores but CPU usage is only 6% :-(

Thanks in advance
Tim
11) Message boards : Number crunching : Diffrence between app "YAFU 134.05 (mt)" and app "YAFU-4t 134.05 (4t)" (Message 1127)
Posted 19 Feb 2018 by UBT - Timbo
Post:
I'm interested only if they are 2 or more days over the deadline.


Hi

Maybe you should take a look at this thread:

https://yafu.myfirewall.org/yafu/forum_thread.php?id=297

regards
Tim
12) Message boards : Number crunching : Stuck at 100% for nearly 6 days !! (Message 1110)
Posted 13 Feb 2018 by UBT - Timbo
Post:
Hi

I've had a Yafu 4T task at 100% for nearly 6 days now...and obviously the deadline was passed some time ago.

I left it running because, when people reported previous issues, the general response was to leave it, as credit is still earned up to 10 days past the deadline.

I checked my tasks and it seems the WU had already "errored" although this was not advised to me and the task was still using up my computer processing time.

This was the report from my tasks:

Task: 1766053
Computer: 10745
Sent: 6 Feb 2018, 13:31:05 UTC
Time: 13 Feb 2018, 13:31:05 UTC
Status: Timed out - no response
Run time (sec): 0.00
CPU time (sec): 0.00
Credit: ---
Application: YAFU-4t v134.05 (4t) windows_intelx86

and this is the work unit:

https://yafu.myfirewall.org/yafu/result.php?resultid=1766053

Stderr report doesn't show anything (good or bad)....so I've aborted the task now. The PC is running fine and is crunching other tasks without any issues...Not sure why this Yafu task would have an issue or why, if it errored it was not "deleted" from the PC.

regards
Tim

Datenschutz / Privacy Copyright © 2011-2024 Rechenkraft.net e.V. & yoyo