Gaiad cannot reconnect KMS unless restart gaiad #3190

Closed
4 tasks
yulidai opened this issue Dec 22, 2018 · 11 comments
Labels
C:Keys Keybase, KMS and HSMs T:Bug

@yulidai

yulidai commented Dec 22, 2018

Summary of Bug

To use tendermint/kms with gaiad, I changed priv_validator_laddr. The two then connect successfully and work together.
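
For context, a minimal sketch of that change, assuming it is made in gaiad's config.toml. The listen address and port are illustrative assumptions; the actual value used is not given in this report.

```toml
# config.toml (gaiad node)
# Address the node listens on for an external signer such as tmkms.
# Host and port are assumed values for illustration only.
priv_validator_laddr = "tcp://127.0.0.1:26658"
```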

The problem occurs after restarting kms: gaiad keeps printing Ping module=privval err="remote signer timed out". The two never reconnect unless I restart gaiad.

Steps to Reproduce

Start kms.
Start gaiad with kms's pubkey.

Then restart kms.

Edit:
gaiad version: 0.28.0-0-g68019be
tmkms's version: 0.23

Edit2:
Same problem with version: 0.29.0-0-g2b3842c


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@jackzampolin jackzampolin added T:Bug C:Keys Keybase, KMS and HSMs labels Jan 3, 2019
@tarcieri

tarcieri commented Jan 3, 2019

This seems like a dupe of:

tendermint/tendermint#2876
tendermint/tmkms#116

@jackzampolin
Member

@tarcieri is that included in a release yet?

@tarcieri

tarcieri commented Jan 4, 2019

To my knowledge, this specific bug is addressed in 0.29 (see tendermint/tendermint#2876). However, there is a similar but distinct issue, tendermint/tendermint#2923, which is still unresolved; I believe it does not result in this specific failure mode.

@jleni
Member

jleni commented Feb 15, 2019

I can also reproduce a situation where, even though the connection with the KMS is still present, gaiad never asks for a signature again. The only fix is to restart gaiad.

This directly affects Ledger's initialization.

Repro steps:

  • A Ledger device is connected to the KMS as a signer.
  • The prevote/vote arrives at the device and the Ledger app requests manual confirmation.
  • Now there are two alternatives:
    1- The user rejects => Rejection
    2- The user takes a long time to accept => Timeout
  • gaiad receives an error (1. no signature, or 2. timeout).

After that, it will never request signatures from the KMS again.

E[2019-02-15|13:47:26.684] enterPropose: Error signing proposal         module=consensus height=520 round=0 err=EOF
E[2019-02-15|13:47:26.684] Ping                                         module=privval err=EOF
E[2019-02-15|13:47:30.231] Error signing vote                           module=consensus height=520 round=0 vote="Vote{0:EC58323E4EE6 520/00/1(Prevote) 000000000000 000000000000 @ 2019-02-15T12:47:27.694819811Z}" err=EOF
E[2019-02-15|13:47:30.231] Ping                                         module=privval err=EOF
E[2019-02-15|13:47:39.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:48:09.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:48:39.637] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:49:09.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:49:39.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:50:09.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:50:39.637] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:51:09.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:51:39.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:52:09.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:52:39.632] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:53:09.621] Couldn't connect to any seeds                module=p2p 
E[2019-02-15|13:53:39.621] Couldn't connect to any seeds                module=p2p 

@jleni
Member

jleni commented Feb 15, 2019

Linking tendermint/tmkms#172 as a related issue

@jleni
Member

jleni commented Feb 18, 2019

👍

@liamsi
Contributor

liamsi commented Feb 18, 2019

ref: tendermint/tendermint#2923 (comment) for the 2nd case
tl;dr: the timeout is configurable: timeout_propose in your config.toml, AFAIR
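
A sketch of that setting, assuming the usual config.toml layout in which consensus timeouts sit under the [consensus] table and are written as duration strings. The value shown is illustrative, not a recommendation.

```toml
# config.toml (gaiad node)
[consensus]
# How long the node waits for a proposal before moving on.
# "60s" is an assumed value chosen only to illustrate a longer window.
timeout_propose = "60s"
```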

This does not solve the 1st issue though.

@jleni
Member

jleni commented Feb 25, 2019

@yulidai Were you running a single node?

@cwgoes
Contributor

cwgoes commented Mar 15, 2019

@jleni We've fixed this upstream, right?

In any case, this is a Tendermint issue, not an SDK issue.

@cwgoes cwgoes closed this as completed Mar 15, 2019
@yulidai
Author

yulidai commented Mar 18, 2019

It's ok, thanks

6 participants