use local files instead of arguments #1

krtschmr · 2017-09-10T15:24:56Z

the panel can be ignored completely, since all the data is available in local files (it's the same data that gets reported to the panel anyway).

we can simply run the script, no configuration needed. makes it way easier.

transeos · 2017-09-10T16:30:58Z

Sorry, I got busy with another project.
I made the change just now. I'll commit it within an hour.

Also, accessing the panel checks whether the internet connection is working. If the internet is down, there is no need to reboot repeatedly every 8 min. Obviously there are other ways to check for internet; I thought checking for panel info was not a bad approach.
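
One of the "other ways" to check for internet without going through the panel is a plain TCP connect to a well-known host. A minimal sketch, assuming Python 3; the function name `has_internet`, the default host `8.8.8.8` on port 53, and the injectable `connect` parameter are illustrative choices, not part of the script discussed here:

```python
import socket


def has_internet(host="8.8.8.8", port=53, timeout=3.0,
                 connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) can be opened.

    host/port default to a public DNS server; this is an assumption,
    any host that is reliably reachable from the rig would do.
    `connect` is injectable so the check can be exercised offline.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # covers timeouts, refused connections and DNS failures
        return False
    conn.close()
    return True
```

The monitor loop could call `has_internet()` before deciding to reboot and skip the reboot when it returns `False`.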

krtschmr · 2017-09-10T16:44:21Z

if the panel has wrong stats, gets no updates (which happens sometimes for a few rigs when they go zombie-load!) or we can't get the data for any other reason, then we don't actually have any current data.

i sometimes have a rig that goes into high load but hashes like a champ. it's simply under high load, so i can't ssh into it and it can't update the panel (but it hashes!). if a card fails, we will never see it via the panel, only via claymore-ethminer.exe or if we check locally. i think that's the better approach. the data is local, why gather it remotely?

checking whether we have internet is a nice thing tho

transeos · 2017-09-10T16:48:23Z

It happened to me once while mining xmr. Instead of gpu mining, it probably started cpu mining.

I think if the panel is not getting refreshed even once in 8 min, there is some problem which should be looked at.

transeos · 2017-09-10T17:15:19Z

I've pushed another change to handle the above situation.

krtschmr · 2017-09-10T20:37:31Z

invalid url @ 2017-09-11 03:26:50.628405
invalid url @ 2017-09-11 03:30:50.651621

seems like something is wrong here

transeos · 2017-09-10T20:41:02Z

Please change the rig name and panel address to xxx if required and show me the output of "/home/ethos/gpu_crash.log".

krtschmr · 2017-09-10T21:18:51Z

i did send you an email

transeos · 2017-09-10T21:19:37Z

trying a workaround

krtschmr · 2017-09-10T21:58:43Z

i wonder how this can actually happen, since the url should always be in the files.

maybe it's best to drop the json and switch to local reads, then we avoid this source of error.

how many rigs do you have to try out?

transeos · 2017-09-10T22:05:24Z

I've made the rig name and panel URL optional arguments so that you can pass them on those rigs where you are running into the error.
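
Optional overrides like these are straightforward with `argparse`. A minimal sketch, assuming Python 3; the flag names `--rig-name` and `--panel-url` are hypothetical and may differ from what the script actually accepts:

```python
import argparse


def parse_args(argv=None):
    """Parse optional overrides; None means "use whatever the script detects"."""
    parser = argparse.ArgumentParser(description="ethOS crash monitor (sketch)")
    # hypothetical flag names, chosen for illustration only
    parser.add_argument("--rig-name", default=None,
                        help="override the rig name detected on the rig")
    parser.add_argument("--panel-url", default=None,
                        help="override the panel URL detected on the rig")
    return parser.parse_args(argv)
```

Run without flags, both values stay `None`, so the overrides are only needed on the rigs that hit the error.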

krtschmr · 2017-09-11T08:22:09Z

today from ethosdistro channel
"Configmaker / Stats panel temporarily down. Will update with ETA when available. website/update/update2/get/paste are online"

so, i will fork and do everything locally :P

transeos · 2017-09-11T08:34:06Z

I'm also facing some issues.

krtschmr · 2017-09-11T09:02:37Z

    import commands  # Python 2 only

    # the last line of miner_hashes.file holds one hashrate per GPU
    miner_hashes = map(float, commands.getstatusoutput("cat /var/run/ethos/miner_hashes.file")[1].split("\n")[-1].split())
    numGpus = int(commands.getstatusoutput("cat /var/run/ethos/gpucount.file")[1])
    numRunningGpus = len(filter(lambda a: a > 0, miner_hashes))

we can use these and everything should work?
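
The same values can be read without spawning `cat`. A minimal sketch, assuming Python 3 and the file layout implied by the snippet above (the last line of `miner_hashes.file` holds one whitespace-separated hashrate per GPU, and `gpucount.file` holds a single integer); the function name `read_gpu_status` and its parameters are illustrative:

```python
def read_gpu_status(hashes_path="/var/run/ethos/miner_hashes.file",
                    count_path="/var/run/ethos/gpucount.file"):
    """Return (num_gpus, num_running_gpus, hashes) from the local stat files."""
    with open(hashes_path) as f:
        lines = f.read().strip().split("\n")
    # the last line is assumed to be the most recent sample
    hashes = [float(value) for value in lines[-1].split()]

    with open(count_path) as f:
        num_gpus = int(f.read().strip())

    # a GPU counts as running when it reports a hashrate above zero
    num_running = sum(1 for h in hashes if h > 0)
    return num_gpus, num_running, hashes
```

Taking the paths as parameters keeps the function testable on a machine that is not an ethOS rig.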

krtschmr · 2017-09-11T09:10:33Z

idea:
this shorty does kinda the same?

https://pastebin.com/s4VewKJB
edit:
this is even better:
https://pastebin.com/8Zu5G5rA

transeos · 2017-09-11T09:53:27Z

Thanks.

I'll have a look later.

krtschmr · 2017-09-11T19:51:53Z

https://github.com/krtschmr/ethos_monitor/blob/master/check_crash.py

this works perfectly now, including an autoupdate before it reboots, in case we changed anything. i'll run this version on my farm now (but somehow my farm has been stable since then. weird ;) )

transeos · 2017-09-11T20:57:27Z

Sorry, I'll be too busy in the next 2 days to review this change.

ghost · 2018-02-04T18:11:12Z

@krtschmr does it work on ethos 1.2.9?

krtschmr · 2018-02-05T00:56:13Z

@LazyScream absolutely. However, 1.2.9 wasn't stable for my farm so i kept the rigs at 1.2.7.
The script itself will keep working until they make major changes to the GPU statistics.

ghost · 2018-02-05T01:37:16Z

@krtschmr
I found that your release does not need "rigname" and "ethosdistro.com/?json=yes".
So I just put check_crash.py under /home/ethos and add "@reboot /home/ethos/ethos_monitor/check_crash.py" to crontab -e, and your script will run automatically, right?

krtschmr · 2018-02-05T01:43:01Z

almost :-)

    wget https://raw.githubusercontent.com/krtschmr/ethos_monitor/master/check_crash.py
    crontab -e
    # add this line to the crontab:
    @reboot /home/ethos/check_crash.py
    # save with ctrl+o, then start the script right away:
    python check_crash.py &   # or you can run "r" for reboot

ghost · 2018-02-05T01:59:43Z

@krtschmr
ok! thx all!
and do you have any ideas for adding Pushover to this script?

krtschmr · 2018-02-05T02:02:01Z

ya, google knows

    import http.client, urllib.parse  # Python 3 module names

    conn = http.client.HTTPSConnection("api.pushover.net:443")
    conn.request("POST", "/1/messages.json",
      urllib.parse.urlencode({
        "token": "APP_TOKEN",   # replace with your application token
        "user": "USER_KEY",     # replace with your user key
        "message": "RIG OFFLINE!!! OMG, we are broke!",
      }), { "Content-type": "application/x-www-form-urlencoded" })
    conn.getresponse()

ghost · 2018-02-05T02:42:38Z

So I copy the code anywhere into the script?
And do APP_TOKEN and USER_KEY need to be replaced?
////
i got an error:

    File "./check_crash.py", line 22, in
    import http.client
    ImportError: No module named http.client

krtschmr · 2018-02-05T05:00:28Z

i really can't help with that, i'm not a specialist in python. obviously you need to bundle the http package first.
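
The `ImportError` above is what Python 2 raises for this snippet: `http.client` and `urllib.parse` exist only in Python 3, while Python 2 ships the same functionality as `httplib` and `urllib.urlencode`. A minimal sketch that runs on either version, assuming the Pushover endpoint from the snippet above; the function names are illustrative, and `APP_TOKEN` / `USER_KEY` are placeholders that do have to be replaced with the values from your own Pushover account:

```python
try:  # Python 3
    from http.client import HTTPSConnection
    from urllib.parse import urlencode
except ImportError:  # Python 2
    from httplib import HTTPSConnection
    from urllib import urlencode


def build_pushover_body(token, user, message):
    """Return the form-encoded request body for /1/messages.json."""
    return urlencode({"token": token, "user": user, "message": message})


def send_pushover(token, user, message):
    """POST the message to Pushover and return the HTTP status code."""
    conn = HTTPSConnection("api.pushover.net", 443)
    conn.request("POST", "/1/messages.json",
                 build_pushover_body(token, user, message),
                 {"Content-type": "application/x-www-form-urlencoded"})
    status = conn.getresponse().status
    conn.close()
    return status
```

A call such as `send_pushover("APP_TOKEN", "USER_KEY", "RIG OFFLINE")` would typically go right before the reboot.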

Trigun87 · 2018-02-27T08:32:16Z

i made a reboot function with telegram warning

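    import commands  # Python 2 only
    import os
    from urllib import urlopen, quote  # Python 2 locations

    def RebootRig():
      # DumpActivity and miner_hashes come from the rest of check_crash.py
      DumpActivity("Rebooting (" + str(miner_hashes) + ")")
      # the first field of /proc/uptime is the uptime in seconds
      uptime = int(float(commands.getstatusoutput("cat /proc/uptime")[1].split()[0]))
      m, s = divmod(uptime, 60)
      h, m = divmod(m, 60)
      msg = quote("Rig1 Reboot uptime %d:%02d:%02d" % (h, m, s))
      # replace XXX:APIKEY and ID with your bot token and chat id
      urlopen("https://api.telegram.org/botXXX:APIKEY/sendmessage?chat_id=ID&text=" + msg).read()
      os.system("sudo hard-reboot")
      os.system("sudo reboot")  # only reached if hard-reboot returns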

and now i'm using @krtschmr's version
now i only need to test if the uptime var is working ^_^ (just use telegram's botfather to make a new bot and get the api key)
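
The uptime part can be checked in isolation, without rebooting or sending anything. A minimal sketch, assuming Python 3 and the standard Linux `/proc/uptime` format (uptime in seconds as the first field); the function names are illustrative, and `BOT_TOKEN` / `CHAT_ID` stand in for the values you get from BotFather and your own chat:

```python
from urllib.parse import quote


def format_uptime(seconds):
    """Format a number of seconds as H:MM:SS, dropping fractions."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return "%d:%02d:%02d" % (hours, minutes, secs)


def read_uptime(path="/proc/uptime"):
    """Return the system uptime in seconds (first field of /proc/uptime)."""
    with open(path) as f:
        return float(f.read().split()[0])


def build_telegram_url(bot_token, chat_id, text):
    """Build a Bot API sendMessage URL with the text percent-encoded."""
    return ("https://api.telegram.org/bot%s/sendMessage?chat_id=%s&text=%s"
            % (bot_token, chat_id, quote(text)))
```

Converting to `int` before `divmod` matters: with a float uptime, `str()` on the parts produces values like `1.0:2.0:5.4` in the message.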

Trigun87 · 2018-02-27T16:18:37Z

i think i found a bug in @krtschmr's version...
in the disconnectCount part the script checks 12 times (without waiting), then hits the break and the script stops
i think you need to add a reboot or a continue or something there, and a time.sleep too
i changed it this way

    if (numRunningGpus != numGpus or numGpus != 13):  # 13 = GPUs expected on this rig
      if (waitForReconnect == 1 and numRunningGpus == 0):
        # all GPUs dead. probably TCP disconnect / pool issue
        # we wait 12 times to resolve these issues. with the 15 s sleep below this equals 3 minutes. most likely appears with nicehash.
        disconnectCount += 1
        if (disconnectCount > 12):
          DumpActivity("Waiting for hashes back: " + str(disconnectCount))
          RebootRig()
          break
      else:
        # only some GPUs are missing or dead: reboot right away
        disconnectCount = 0
        RebootRig()
        break
    time.sleep(15)
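
The decision made in this loop can be pulled out into a small pure function, which makes the grace period easy to test without rebooting anything. A minimal sketch, assuming Python 3; `next_action`, `MAX_DISCONNECT_CHECKS` and the returned action strings are illustrative names, not identifiers from the script:

```python
MAX_DISCONNECT_CHECKS = 12  # 12 checks x 15 s sleep = 3 minutes


def next_action(num_gpus, num_running, expected_gpus, disconnect_count):
    """Decide what the monitor loop should do after one check.

    Returns (action, new_disconnect_count), where action is
    "ok", "wait" or "reboot".
    """
    if num_gpus == expected_gpus and num_running == num_gpus:
        # everything is hashing: reset the grace counter
        return "ok", 0
    if num_running == 0:
        # all GPUs at zero: probably a pool/TCP disconnect, give it time
        disconnect_count += 1
        if disconnect_count > MAX_DISCONNECT_CHECKS:
            return "reboot", disconnect_count
        return "wait", disconnect_count
    # some GPUs missing or not hashing: reboot right away
    return "reboot", 0
```

The loop would call `next_action` once per pass, sleep 15 seconds on `"ok"` and `"wait"`, and call `RebootRig()` on `"reboot"`.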

jmverges · 2018-03-01T07:54:47Z

@krtschmr is what @Trigun87 is saying true?

krtschmr · 2018-03-01T07:55:40Z

i don't know yet, had no time to look into it, still trying to get the new 600 gpu farm stable....

i can fix it later

jmverges · 2018-03-01T07:56:48Z

600 gpu? 😮

Trigun87 · 2018-03-03T18:23:52Z

ok i fixed the disconnect check (the var waitForReconnect was useless since it was always 1)

https://github.com/Trigun87/ethos_monitor

i just forked ^_^ i use a new file for the telegram warning (disabled by default) and for the number of gpus on the rig (if it starts with fewer gpus it will reboot)

krtschmr · 2018-03-06T06:38:18Z

@Trigun87 wanna merge it into mine?

Trigun87 · 2018-03-10T11:29:13Z

@krtschmr if u like my version ^_^ (btw is that something u should do or something i should do? never merged anything :-P)

krtschmr · 2018-03-12T10:09:09Z

@jmverges how does this ethOS FRiends group work? i can't create repositories or do anything...

@Trigun87 check gist:

so, my problem is that nicehash sometimes terminates the connections, and/or i don't get work. if i reboot, they are hashing again. sometimes 3/4 of the farm is dead overnight. the issue is the reboot script: ethos 1.2.7 (all versions < 1.2.9) has issues with claymore, which then still reports SOME hashrate even tho it's zero. i can't upgrade to 1.3.0 since powerplay messes up and we would use 8% more electricity

this should fix it. maybe useful for somebody?
https://gist.github.com/krtschmr/a915ee7fa9c9c42961a2376dfebf208b

transeos closed this as completed Sep 10, 2017

transeos reopened this Sep 10, 2017
