Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOFA-Registry should be tolerant of clock adjustments #326

Open
ashlee618 opened this issue Jun 20, 2023 · 7 comments
Open

SOFA-Registry should be tolerant of clock adjustments #326

ashlee618 opened this issue Jun 20, 2023 · 7 comments
Assignees

Comments

@ashlee618
Copy link
Contributor

Your question

At present, SOFA-Registry relies more on time, but there is no good classification for the use of time.
All time dependencies in the project are on the wall-clock time System.currentTimeMillis.
This may not be a good idea for distributed architectures, especially in the financial field.

Your scenes

Scenarios for calculating timeout, for example:

DataChangeEventCenter#geExpires

  List<ChangeNotifier> getExpires() {
    final List<ChangeNotifier> expires = Lists.newLinkedList();
    final long now = System.currentTimeMillis();
    synchronized (retryNotifiers) {
      final Iterator<ChangeNotifierRetry> it = retryNotifiers.iterator();
      while (it.hasNext()) {
        ChangeNotifierRetry retry = it.next();
        // Comparing with before configured timeout
        if (retry.expireTimestamp <= now) {
          expires.add(retry.notifier);
          it.remove();
        }
      }
    }
    return expires;
  }

DataChangeEventCenter#commitRetry

  boolean commitRetry(ChangeNotifier retry) {
    final int maxSize = dataServerConfig.getNotifyRetryQueueSize();
    // here to set expireTime
    final long expireTimestamp =
        System.currentTimeMillis() + dataServerConfig.getNotifyRetryBackoffMillis();
    synchronized (retryNotifiers) {
      if (retryNotifiers.size() >= maxSize) {
        // remove first
        retryNotifiers.removeFirst();
      }
      retryNotifiers.add(new ChangeNotifierRetry(retry, expireTimestamp));
    }
    return true;
  }

This kind of time-out detection using a monotone clock is better in distributed systems.

Your advice

maybe we can take a look at zookeeper and separate the time-dependent functions in the project by time point and time interval, and then choose a wall clock or a monotonic clock to solve the problem.

https://svn.apache.org/viewvc?view=revision&revision=1657745

git version :2789d1df02e697ea511867546dc569ff6b405ece

Environment

  • SOFARegistry version:
  • JVM version (e.g. java -version):
  • OS version (e.g. uname -a):
  • Maven version:
  • IDE version:
@NickNYU
Copy link
Contributor

NickNYU commented Jun 20, 2023

Thanks for advice, but what I see from ZK's perspective, is that, ZK do the same stuff through wall clock:

package org.apache.zookeeper.common;

import java.util.Date;

public class Time {
    /**
     * Returns time in milliseconds as does System.currentTimeMillis(),
     * but uses elapsed time from an arbitrary epoch more like System.nanoTime().
     * The difference is that if somebody changes the system clock,
     * Time.currentElapsedTime will change but nanoTime won't. On the other hand,
     * all of ZK assumes that time is measured in milliseconds.
     * @return  The time in milliseconds from some arbitrary point in time.
     */
    public static long currentElapsedTime() {
        return System.nanoTime() / 1000000;
    }

    /**
     * Explicitly returns system dependent current wall time.
     * @return Current time in msec.
     */
    public static long currentWallTime() {
        return System.currentTimeMillis();
    }

    /**
     * This is to convert the elapsedTime to a Date.
     * @return A date object indicated by the elapsedTime.
     */
    public static Date elapsedTimeToDate(long elapsedTime) {
        long wallTime = currentWallTime() + elapsedTime - currentElapsedTime();
        return new Date(wallTime);
    }
}

So, I'm gonna ask you about what do you mean by saying monotone clock? Logic clock is not capable of calculating time expiration, only wall clock does....., as well as I known, event Google Spanner could not guarantee an increasing order of clock with a atomic clock inside each machine running Spanner.

@ashlee618
Copy link
Contributor Author

Thank you for your reply. I agree with your statement, but I think you misunderstood my meaning.

I guess you're talking about sequences, a stricter sort relationship.

What I mean is that we can replace some System.currentTimeMillis with a monotonic clock like Sytem.nanoTime, so that the application can tolerate clock induced issues such as drift and clock jumping.

@nocvalight
Copy link
Member

Thank you for your advice. I think your advice is reasonable. In some use cases of sofa registry, such as obtaining timeout tasks within a certain time interval, it is indeed difficult to avoid the impact of system clock time jumps when using System.currentTimeMillis() in the code. In this scenario, using System.nanoTime() is indeed more appropriate.

@ashlee618
Copy link
Contributor Author

please assign me.

@ashlee618
Copy link
Contributor Author

I will try to organize their dependence on time, and do you have any good suggestions?

Is the scope of our refactoring the entire project or just some features.

@nocvalight
Copy link
Member

please assign me.

assigned

@nocvalight
Copy link
Member

nocvalight commented Jun 23, 2023

I will try to organize their dependence on time, and do you have any good suggestions?

Is the scope of our refactoring the entire project or just some features.

The scenario we discussed above is for comparing a time interval on the same machine, such as whether a push task has expired. In this scenario, I think using System.nanoTime() is appropriate. At the same time, there are also some other scenarios in SFARegistry, such as generating a DatumVersion after the datum is changed, such as the following code:

public static long confregNextId(long lastVersion) {
    long timestamp = timeGen();
    if (lastVersion < timestamp) {
      return timestamp;
    }
    if (lastVersion > timestamp + 10) {
      LOG.warn(
          "[DatumVersion] last version {} is higher than currentTimestamp {}",
          lastVersion,
          timestamp);
    }
    return lastVersion + 1;
  }

The DatumVersion may generate from different machine and needs to be compared, and in this scenario, what needs to be compared is the global time of different machines. In this scenario, using System.nanoTime() is not suitable.
So, I think that our refactoring scope should be focused on scenarios where time comparison is made on the same machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants