Behaviour of MT Threaded Application

Profile Conan
Joined: 5 Sep 11
Posts: 46
Credit: 7,102,856
RAC: 3,093
Australia
Message 310 - Posted: 24 Jan 2012, 16:04:59 UTC

On my Linux computers running Fedora 16 (64-bit), I have noticed that while the Resources usage shows 89% to 99% with all cores in use, the Process usage is only 4% per core on my 4-core machine and 6% per core on my 6-core machine.

This then jumps to 96% to 100% and 1 core in use (with all other cores still held but not being used) when the WU is finishing up the calculation.
This then goes to 84% to 100% and all available cores again when the gnfs sieve part of the WU kicks in.

So in the first stage all cores are grabbed but the computer is doing very little, whereas in the second stage the computer is fully utilised.

I would assume that the Windows WU does a similar thing.

Just some observations.

Conan
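
Per-core figures like the ones above can be cross-checked outside the BOINC Manager. The following is a minimal sketch, assuming a Linux system that exposes /proc/stat; the function names are illustrative and are not part of BOINC or yafu.

```python
import time


def read_cpu_times(stat_text):
    """Return {cpu_name: (busy, total)} in jiffies from /proc/stat-style text."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 user nice system idle iowait ...".
        # The aggregate "cpu" line is skipped on purpose.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        # Only the first eight fields are summed; guest time is already
        # included in user time.
        total = sum(values[:8])
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        times[fields[0]] = (total - idle, total)
    return times


def utilisation(before, after):
    """Percentage of busy time per core between two samples."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before[cpu]
        delta = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / delta if delta else 0.0
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return per-core usage."""
    with open(path) as f:
        first = read_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = read_cpu_times(f.read())
    return utilisation(first, second)
```

Calling `sample()` while a WU is running returns one percentage per core, which can be compared against the per-process figures described above.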

Profile Conan
Joined: 5 Sep 11
Posts: 46
Credit: 7,102,856
RAC: 3,093
Australia
Message 311 - Posted: 24 Jan 2012, 16:35:23 UTC

I still can't get a WU to finish on my AMD Phenom X6 Linux 64 bit computer.

I am now getting File transfer errors and 'ecm' command errors.

The BOINC messages show File Size : 103866468.000000 Limit : 5000000.000000.

See 639963
and 640058

So my file is larger than the limit allows, and this is failing the WU.

Is there a cure for this?

Conan
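
To put the two numbers from the BOINC message side by side, here is a small sketch of the comparison. The function name is hypothetical, and the post does not say which file the limit applies to, so this only illustrates the arithmetic.

```python
def check_file_size(size_bytes, limit_bytes):
    """Return (within_limit, ratio), where ratio is size divided by limit."""
    ratio = size_bytes / limit_bytes
    return size_bytes <= limit_bytes, ratio


# Values taken from the BOINC message quoted above.
ok, ratio = check_file_size(103866468, 5000000)
print(f"within limit: {ok}, size is {ratio:.1f}x the limit")
```

With these values the file is roughly twenty times larger than the limit, so a small rounding or timing difference cannot explain the failure.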
Profile yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 22 Aug 11
Posts: 734
Credit: 17,574,526
RAC: 510
Germany
Message 312 - Posted: 25 Jan 2012, 0:01:05 UTC

It seems that your BOINC client hasn't downloaded the ecm binary, which is available at http://yafu.dyndns.org/yafu/download/ecm-6.2.1_x86_64-pc-linux-gnu and is exactly 536768 bytes.

Can you start a WU, stop it, and send me the contents of the slot directory at
yoyo (a) mailueberfall . de?

It seems that there is no ecm binary there.

yoyo
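
A quick local check for the situation described above might look like the sketch below. The expected size comes from the post; the function name and the idea of also testing the executable bit are assumptions, and the path to the binary depends on the local BOINC installation.

```python
import os

EXPECTED_ECM_SIZE = 536768  # bytes, as stated in the post above


def check_binary(path, expected_size=EXPECTED_ECM_SIZE):
    """Return a short diagnosis string for the file at `path`."""
    if not os.path.isfile(path):
        return "missing"
    size = os.path.getsize(path)
    if size != expected_size:
        return f"wrong size ({size} bytes, expected {expected_size})"
    # Assumption: the binary must be executable to be started by the app.
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"
```

Pointing `check_binary` at the ecm file in the project directory, and at the copy or link in the slot directory while a WU is running, shows whether the file is absent, truncated, or present but not executable.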
Profile yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 22 Aug 11
Posts: 734
Credit: 17,574,526
RAC: 510
Germany
Message 313 - Posted: 25 Jan 2012, 1:19:28 UTC

I found a possible cause of the "ecm command not found" error and created a new Linux 64-bit version.
yoyo
Profile Conan
Joined: 5 Sep 11
Posts: 46
Credit: 7,102,856
RAC: 3,093
Australia
Message 314 - Posted: 25 Jan 2012, 3:17:56 UTC

Right, thanks YOYO.
I have detached and reattached to the project just to make sure that I have downloaded the latest version.
The "ecm" binary is in the project directory on both computers; I did not check whether it was there before.

Have downloaded another WU on both machines so will now run them and see how they go.

Conan
Profile Conan
Joined: 5 Sep 11
Posts: 46
Credit: 7,102,856
RAC: 3,093
Australia
Message 315 - Posted: 25 Jan 2012, 4:44:49 UTC

Thanks YOYO, the application is working now and I have validated, successful results.

The BOINC messages say that the app found "negative CPU". What does that mean?

Conan
Speedy51

Joined: 25 Jan 12
Posts: 23
Credit: 1,529,974
RAC: 0
New Zealand
Message 316 - Posted: 25 Jan 2012, 8:42:07 UTC

I'm aware this app is very new. Are there plans to show the progress (% complete) while the task is processing?
Profile yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 22 Aug 11
Posts: 734
Credit: 17,574,526
RAC: 510
Germany
Message 317 - Posted: 25 Jan 2012, 10:24:53 UTC - in response to Message 315.  


The BOINC messages say that the app found "negative CPU". What does that mean?

I don't know what this really means. I saw it as well.
yoyo
Profile yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 22 Aug 11
Posts: 734
Credit: 17,574,526
RAC: 510
Germany
Message 318 - Posted: 25 Jan 2012, 10:25:31 UTC - in response to Message 316.  

I'm aware this app is very new. Are there plans to show the progress (% complete) while the task is processing?

There is currently no way to show a progress bar.
yoyo
Speedy51

Joined: 25 Jan 12
Posts: 23
Credit: 1,529,974
RAC: 0
New Zealand
Message 321 - Posted: 25 Jan 2012, 23:49:59 UTC - in response to Message 318.  

I'm aware this app is very new. Are there plans to show the progress (% complete) while the task is processing?

There is currently no way to show a progress bar.
yoyo

Thanks for your prompt answer
Krzychu P.

Joined: 23 Sep 11
Posts: 3
Credit: 52,276
RAC: 0
Poland
Message 334 - Posted: 3 Feb 2012, 11:04:24 UTC

I'll write it here.

Could you take a look at this WU:
yafu_C104_1328030114_495_0

Something went wrong at the end of the computation, because it finished with an error.
In the meantime it checkpointed at intervals of 30 to 90 minutes.

What should I do to finish such WUs correctly? (I now have a similar long-running WU.)
Krzychu P.

Joined: 23 Sep 11
Posts: 3
Credit: 52,276
RAC: 0
Poland
Message 337 - Posted: 8 Feb 2012, 9:39:41 UTC - in response to Message 334.  

I don't know where my last post is, so I'll write again.

I'll write it here.

Could you take a look at this WU:
yafu_C104_1328030114_495_0

Something went wrong at the end of the computation, because it finished with an error.
In the meantime it checkpointed at intervals of 30 to 90 minutes.

What should I do to finish such WUs correctly? (I now have a similar long-running WU.)


Once again the same situation as above.
WU finished with (Stderr output):
linear algebra completed 261019 of 261780 dimensions (99.7%, ETA 0h 0m)    
app exit status: 0xff
14:45:09 (752): called boinc_finish


On the WU's page I found:
Client state: Compute error
Exit status: 195 (0xc3) EXIT_CHILD_FAILED


Could you fix it? I've reset the project in my BOINC Manager, but it didn't help.
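
The two status values in this post are related: the application itself exited with 0xff (255), and the task was then reported as 195 (0xc3), EXIT_CHILD_FAILED. The sketch below shows one way a supervising process could map a failing child to that code. It is an illustration under that assumption, not the BOINC wrapper's actual source, and `run_child` is a hypothetical name.

```python
import subprocess
import sys

EXIT_CHILD_FAILED = 195  # 0xc3, the value shown on the WU page above


def run_child(argv):
    """Run a child program and map any non-zero exit to EXIT_CHILD_FAILED.

    Assumption: this mimics a wrapper in spirit only; it is not BOINC code.
    """
    status = subprocess.run(argv).returncode
    # Print the low byte in hex, in the same style as the stderr output above.
    print(f"app exit status: {status & 0xff:#x}")
    return 0 if status == 0 else EXIT_CHILD_FAILED


if __name__ == "__main__":
    # A stand-in child that exits with 255, like the failing app above.
    code = run_child([sys.executable, "-c", "import sys; sys.exit(255)"])
    print(f"reported exit status: {code} ({code:#x})")
```

Read this way, 195 only says that some child program failed; the cause has to be found in the child's own output, here the linear algebra stage that stopped at 99.7%.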
Krzychu P.

Joined: 23 Sep 11
Posts: 3
Credit: 52,276
RAC: 0
Poland
Message 364 - Posted: 2 Mar 2012, 15:33:39 UTC

And again something is wrong:
Task 658619
http://yafu.dyndns.org/yafu/result.php?resultid=658619
Profile yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 22 Aug 11
Posts: 734
Credit: 17,574,526
RAC: 510
Germany
Message 366 - Posted: 2 Mar 2012, 21:00:45 UTC - in response to Message 364.  
Last modified: 2 Mar 2012, 21:01:10 UTC

Do you still have the BOINC log of this?
The result claims that the maximum disc usage was exceeded. The maximum is set to 2 GB. I wonder if this was really exceeded.
yoyo
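
Whether the 2 GB limit was really reached can be checked by measuring the slot directory while a WU is running. This is a minimal sketch; the function names are illustrative, the limit is assumed to mean binary gigabytes, and how BOINC itself accounts for disc usage may differ from a plain sum of file sizes.

```python
import os

MAX_DISC_BYTES = 2 * 1024**3  # "2GB" from the post; binary gigabytes assumed


def directory_size(path):
    """Sum the sizes of all regular files below `path`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def usage_report(path, limit=MAX_DISC_BYTES):
    """Return (bytes_used, limit_exceeded) for the directory at `path`."""
    used = directory_size(path)
    return used, used > limit
```

Running `usage_report` on the slot directory at intervals gives a record of how close the task gets to the limit before it ends.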
frankhagen

Joined: 2 Sep 11
Posts: 6
Credit: 2,081,426
RAC: 0
Germany
Message 368 - Posted: 3 Mar 2012, 16:19:02 UTC - in response to Message 366.  

do you really think anything inside this:

<result_name>yafu_C106_1330635007_173_0</result_name>
<checkpoint_cpu_time>2205.109000</checkpoint_cpu_time>
<checkpoint_elapsed_time>7130.078125</checkpoint_elapsed_time>
<fraction_done>0.000000</fraction_done>

is as it should be?

i'm talking about a job that has now been running in MT4 mode for 140 minutes...
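
The mismatch in the fragment above can be made explicit with a few lines of Python. This is only a sketch: the `<task>` wrapper element and the function name are made up for parsing purposes, and "MT4" is assumed to mean four threads.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """
<result_name>yafu_C106_1330635007_173_0</result_name>
<checkpoint_cpu_time>2205.109000</checkpoint_cpu_time>
<checkpoint_elapsed_time>7130.078125</checkpoint_elapsed_time>
<fraction_done>0.000000</fraction_done>
"""


def summarise(fragment, threads=4):
    """Compare recorded CPU time with elapsed time for a multi-threaded task."""
    # The fragment has no single root element, so wrap it before parsing.
    root = ET.fromstring(f"<task>{fragment}</task>")
    cpu = float(root.findtext("checkpoint_cpu_time"))
    elapsed = float(root.findtext("checkpoint_elapsed_time"))
    return {
        "cpu_per_elapsed": cpu / elapsed,
        # Upper bound on CPU time if all threads were busy the whole time.
        "max_expected_cpu": threads * elapsed,
        "fraction_done": float(root.findtext("fraction_done")),
    }


summary = summarise(FRAGMENT)
print(f"CPU/elapsed: {summary['cpu_per_elapsed']:.2f}")
print(f"fraction done: {summary['fraction_done']:.2f}")
```

The recorded CPU time is about 0.31 of the elapsed time, whereas a job keeping four threads busy could approach four times the elapsed time, and the fraction done is still zero.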
Profile yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 22 Aug 11
Posts: 734
Credit: 17,574,526
RAC: 510
Germany
Message 369 - Posted: 3 Mar 2012, 17:02:44 UTC

No, I mean in the BOINC stdout log. It should state that the maximum disc usage was exceeded.
frankhagen

Joined: 2 Sep 11
Posts: 6
Credit: 2,081,426
RAC: 0
Germany
Message 370 - Posted: 3 Mar 2012, 17:40:54 UTC - in response to Message 369.  
Last modified: 3 Mar 2012, 17:41:50 UTC

No, I mean in the BOINC stdout log. It should state that the maximum disc usage was exceeded.


i know!

i'm not at that point yet - 460 MB currently, 03:38:50 done, boinc not showing any sign of a checkpoint and still sitting there at 0%. :(

that chain of different apps being run is freaky!

and the wrapper definitely does not pick up the correct runtime.

back to the lab..
Viking69
Joined: 16 Feb 12
Posts: 12
Credit: 107,382
RAC: 0
United States
Message 374 - Posted: 6 Mar 2012, 2:52:34 UTC - in response to Message 321.  

I'm aware this app is very new. Are there plans to show the progress (% complete) while the task is processing?

There is currently no way to show a progress bar.
yoyo

Thanks for your prompt answer


Yoyo, is there a reason why this is not possible?
Profile yoyo_rkn
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 22 Aug 11
Posts: 734
Credit: 17,574,526
RAC: 510
Germany
Message 375 - Posted: 6 Mar 2012, 6:00:20 UTC

The CPU time is NOT used to calculate the credit in CreditNew.

But anyway, I updated the wrapper in version 130.02 and now the CPU time seems a bit better. But as I said, you gain nothing from a correct CPU time; it is NOT used to calculate the credits.

yoyo

Datenschutz / Privacy Copyright © 2011-2024 Rechenkraft.net e.V. & yoyo