Client Detached Issue again!

kb*
Joined: Nov 22 11
Posts: 7
Credit: 562,665
RAC: 0
Message 857 - Posted 17 Feb 2016 21:43:32 UTC

    Last modified: 17 Feb 2016 22:30:15 UTC

    Work units that have been processed and uploaded successfully give the following error when examined on the SAT server:

    "Client detached"

    This error message is appearing before the time limit for a work unit has expired.

    The work unit appears on the server with this status, less than a day after having been downloaded and processed.

    Would someone kindly advise?

    Trying to get my head around this problem!!!

    It may be related to cases where work units were downloaded and "Suspend Network Activity" was then selected from the BOINC Manager "Activity" menu.

    NOTE - The work units were downloaded, processed and reported before the stated deadline!

    PDW
    Joined: Jul 13 14
    Posts: 5
    Credit: 2,616,855
    RAC: 0
    Message 858 - Posted 18 Feb 2016 16:45:30 UTC - in response to Message 857.

      Are you sure they are yours ?

      Looking at your tasks you only have 6 that have errors and they show unhandled exceptions in the logs. None show as Client Detached.

      You have 0 invalid tasks so they aren't there either !

      AriZonaMoon*
      Joined: Oct 19 11
      Posts: 3
      Credit: 597,193
      RAC: 0
      Message 859 - Posted 18 Feb 2016 17:57:19 UTC

        Last modified: 18 Feb 2016 17:58:04 UTC

        Nope.. they are mine.. ;-)

        Please check. It was quite a surprise
        that they were denied like this.

        PDW
        Joined: Jul 13 14
        Posts: 5
        Credit: 2,616,855
        RAC: 0
        Message 860 - Posted 18 Feb 2016 18:12:55 UTC - in response to Message 859.

          Cool, you have someone to type messages for you ;-)

          I am not a project admin; I only asked because I knew I had a load of Client Detached errors. However, I know all of mine occurred on a single box, and I believe there was a valid reason it got messed up and hence detached. The BOINC stderr file filled the disk (the file was up to 940 GB in size), BOINC and the system ran out of disk space, and weird and wonderful things happened :-o

          Hopefully a project admin will reply but you might need to try their other forum: http://forum.boinc.ru/default.aspx?g=topics&f=121

          AriZonaMoon*
          Joined: Oct 19 11
          Posts: 3
          Credit: 597,193
          RAC: 0
          Message 861 - Posted 18 Feb 2016 18:50:08 UTC - in response to Message 860.

            Last modified: 18 Feb 2016 18:54:07 UTC

            I was busy being annoyed.. so I am glad someone else
            typed - or it would only have been some bad words.. ;-p

            I am not too fluent in Russian, and it should be reasonable to
            believe the admins read this message board. Or maybe that's only
            during Christmas..? ;-)

            Disk space should not be the issue here. There is more than enough space -
            and if BOINC comes up with that on its own, it's clearly a problem in
            the project or BOINC itself.

            Also, since BOINC Manager has the option to suspend network activity, it's a
            common and accepted thing that should not cause a project to reject WUs.
            They had a 7-day time limit when downloaded, and the oldest I uploaded were
            less than 3 days old. The newest were less than a day old.
            The project has marked a whole bunch of WUs (at the same moment) with this
            error message, and that has happened several times during the time frame.
            I cannot see a good reason to change the status of my WUs long before
            the time has run out.
            It is not acceptable that the project requires my computer to be online at
            every moment, so that it can abort WUs during the time frame. If the
            time frame is not correct in the first place, the WUs shouldn't be given
            a time frame as long as 7 days to begin with.

            AriZonaMoon*
            Joined: Oct 19 11
            Posts: 3
            Credit: 597,193
            RAC: 0
            Message 862 - Posted 18 Feb 2016 19:26:32 UTC

              Last modified: 18 Feb 2016 19:29:11 UTC

              I tried to write some Russian here... but it turns out that is not allowed.. hehe

              kb*
              Joined: Nov 22 11
              Posts: 7
              Credit: 562,665
              RAC: 0
              Message 863 - Posted 18 Feb 2016 19:35:01 UTC

                Is there a SAT project Admin out there?

                Unfortunately we cannot access the alternative forum suggested by PDW due to the language issue.

                PDW
                Joined: Jul 13 14
                Posts: 5
                Credit: 2,616,855
                RAC: 0
                Message 864 - Posted 18 Feb 2016 19:35:17 UTC - in response to Message 862.

                  Okay, I posted in their other forum asking for someone to look.
                  My Russian has improved but I still use English :-)

                  kb*
                  Joined: Nov 22 11
                  Posts: 7
                  Credit: 562,665
                  RAC: 0
                  Message 865 - Posted 18 Feb 2016 19:58:51 UTC - in response to Message 864.

                    Thank you PDW!

                    PDW
                    Joined: Jul 13 14
                    Posts: 5
                    Credit: 2,616,855
                    RAC: 0
                    Message 866 - Posted 18 Feb 2016 20:40:31 UTC - in response to Message 865.

                      Last modified: 18 Feb 2016 20:41:44 UTC

                      No problem, but Admin is tucked up in bed (it's 2:20am in Irkutsk, UTC+8) so will be a few hours more before you get a reply.

                      Edit: Which reminds me the server time is wrong by 2 hours and 17 minutes, can this be fixed please ?
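A rough check of that offset can be sketched as follows. It assumes the "2:20am" quoted above is rounded to the minute and that the local date in Irkutsk was already 19 Feb; the variable names are illustrative only. Under those assumptions the gap comes out near 2 h 20 min, which is in line with the 2 hours and 17 minutes quoted in the edit.

```python
from datetime import datetime, timedelta, timezone

# Irkutsk is UTC+8, as stated in the post.
IRKUTSK = timezone(timedelta(hours=8))

# Wall-clock time quoted in the post (assumed date: 19 Feb 2016 local time).
local_now = datetime(2016, 2, 19, 2, 20, tzinfo=IRKUTSK)

# Timestamp the forum server put on the same post.
server_stamp = datetime(2016, 2, 18, 20, 40, 31, tzinfo=timezone.utc)

# Convert the local wall-clock time to UTC and compare with the server stamp.
true_utc = local_now.astimezone(timezone.utc)
offset = server_stamp - true_utc

print(true_utc.isoformat())  # 2016-02-18T18:20:00+00:00
print(offset)                # 2:20:31
```

The result is only as precise as the rounded "2:20am" figure, so it cannot confirm the exact 2 h 17 min value, only that the server clock was running about that far ahead.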

                      Oleg Zaikin [SAT@home]
                      Forum moderator
                      Project administrator
                      Project developer
                      Project scientist
                      Joined: Sep 15 11
                      Posts: 133
                      Credit: 4,826,453
                      RAC: 0
                      Message 867 - Posted 18 Feb 2016 20:57:08 UTC - in response to Message 857.

                        I see several tasks from your host with the "error while computing" status, for example http://sat.isa.ru/pdsat/result.php?resultid=18488698. After about 8 such tasks, new tasks with the status "Client detached" start to appear. So I suppose the project thinks that something is wrong with your host. By the way, all tasks that crashed on your host with "error while computing" were successfully processed by other hosts. An example is here: http://sat.isa.ru/pdsat/workunit.php?wuid=8567212. Do you get the "error while computing" status in any other BOINC project? Maybe there are some problems with the OS or with memory?

                        PS. PDW, thank you for the message on boinc.ru.

                        kb*
                        Joined: Nov 22 11
                        Posts: 7
                        Credit: 562,665
                        RAC: 0
                        Message 868 - Posted 18 Feb 2016 21:23:55 UTC - in response to Message 867.

                          Hi Oleg,

                          Thank you for responding.

                          The work units in question are not mine but those of "AriZonaMoon*".

                          Could you kindly investigate further?

                          hoarfrost
                          Forum moderator
                          Volunteer developer
                          Volunteer tester
                          Joined: Oct 11 11
                          Posts: 10
                          Credit: 4,214,224
                          RAC: 0
                          Message 869 - Posted 19 Feb 2016 10:39:06 UTC

                            Last modified: 19 Feb 2016 10:40:23 UTC

                            The usual scenario that produces the "Client detached" problem:

                            1. A participant attaches a new BOINC instance to the project from host "A" and gets a bunch of WUs;
                            2. The BOINC instance is stopped and moved to another computer for processing;
                            3. The participant goes back to (1) to get a new bunch of WUs.

                            When you attach a new BOINC instance to the project from a host that has already contacted the project server, all WUs associated with the BOINC instance that most recently contacted the project are marked as client detached.

                            I know only one solution to this problem: create a stable set of BOINC instances, all attached to the project within a short period. After 2-3 "roundtrips" with the server, each BOINC instance gets a stable ID and works fine. But any new attach from this host can "reset" the WUs of the last-contacted BOINC instance.

                            Happy Crunching!

                            kb*
                            Joined: Nov 22 11
                            Posts: 7
                            Credit: 562,665
                            RAC: 0
                            Message 870 - Posted 19 Feb 2016 15:15:03 UTC - in response to Message 869.

                              Last modified: 19 Feb 2016 16:00:42 UTC

                              Thanks for your input Hoarfrost, though I would not describe this as an unusual scenario.

                              None of the work units is ever moved to another computer for processing though. The WUs are simply downloaded on a BOINC instance, "Network activity" is suspended and the WUs are crunched where they were downloaded.
                              They are only reported when "Network activity" is resumed. It is at this point that they appear to be flagged as "Client detached" by the SAT server(s).

                              This process is one that I have employed on numerous other projects without experiencing this particular outcome.

                              kb*
                              Joined: Nov 22 11
                              Posts: 7
                              Credit: 562,665
                              RAC: 0
                              Message 871 - Posted 19 Feb 2016 17:12:00 UTC - in response to Message 867.

                                Hi Oleg, I accept that all the "Error while computing" tasks are genuine errors.

                                It is the "Client detached" errors that are an issue. :-)

                                hoarfrost
                                Forum moderator
                                Volunteer developer
                                Volunteer tester
                                Joined: Oct 11 11
                                Posts: 10
                                Credit: 4,214,224
                                RAC: 0
                                Message 872 - Posted 20 Feb 2016 14:01:46 UTC - in response to Message 870.

                                  Do you request any other WUs from this computer after receiving the first bunch of WUs?

                                  kb*
                                  Joined: Nov 22 11
                                  Posts: 7
                                  Credit: 562,665
                                  RAC: 0
                                  Message 873 - Posted 22 Feb 2016 21:04:01 UTC - in response to Message 872.

                                    That is only one scenario where I am seeing this error.

                                    While I have no specific instance that I could point out to you, I appear to have far more "Client detached" errors than the 12 that were downloaded as a test on a second Boinc instance on the same machine.

                                    This same procedure works perfectly with other projects that I have tried.


                                    Copyright © 2019 Institute for System Dynamics and Control Theory of SB RAS and Institute for Information Transmission Problems of RAS