Purge transactions older than ignition max stamp #183

Conversation

@efernandez (Collaborator):


Consider an ignition sensor transaction `I` with minimum and maximum
involved stamps `[I_min, I_max]`.

And another transaction `T` with minimum and maximum involved stamps
`[T_min, T_max]`.

Now consider `I` represents an entire graph recorded, so we have
`I_max = I_min + l` where `l` is approximately the `lag_duration`.

We want to process `I` and the first transaction `T` after that graph
was generated, so it adds constraints to existing variables or removes
existing constraints or variables.

For this to happen we must purge any transaction `T` that has:
* `T_min < I_min`, or
* `T_max <= I_max`

Note that `T_max == I_max` for the `T` that generated the graph recorded,
represented by `I`.

These conditions should also be valid for other use cases.

In the particular case of `I`, we also want `T_max` to be exactly the
one after `I_max`. For this we need to disable the `reset` service in
the ignition sensor, so we do not lose any transaction.
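The purge conditions above can be sketched as a small predicate. This is a minimal sketch under assumptions: a hypothetical `Interval` struct and `shouldPurge` helper with stamps reduced to plain doubles, whereas the real check would compare `ros::Time` stamps from `fuse_core::Transaction`:

```cpp
#include <cassert>

// Hypothetical reduction of the purge rule: a transaction T is purged when
// T_min < I_min or T_max <= I_max, where [I_min, I_max] is the ignition
// transaction interval. Equality is allowed for the min, not for the max.
struct Interval
{
  double min;
  double max;
};

bool shouldPurge(const Interval& t, const Interval& ignition)
{
  return t.min < ignition.min || t.max <= ignition.max;
}
```

With this, the transaction whose `T_max == I_max` (the one that generated the recorded graph) is purged, while the first transaction after `I_max` is kept.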
```diff
@@ -425,12 +425,12 @@ void FixedLagSmoother::transactionCallback(
 {
   // If this transaction occurs before the start time, just ignore it
   auto start_time = getStartTime();
-  const auto& min_time = transaction->minStamp();
-  if (started_ && min_time < start_time)
+  const auto max_time = transaction->maxStamp();
```
@efernandez (Collaborator, Author) commented Apr 29, 2020:

Also note that we were doing `const auto& min_time`, which is a const reference to an attribute inside `transaction`, which is `std::move`d when it gets inserted into the `pending_transactions_`, but still used later.

This isn't an issue at the moment, but it could become a problem in the future if something changes the moved-from `transaction`. For that reason I'm taking a copy.

Contributor:

👍 Good catch.

@efernandez (Collaborator, Author) left a comment:

Added some comments to help with the review.

I think we could extend the existing `FixedLagIgnition` test to exercise this case, if you're happy with this implementation.

```cpp
{
  pending_transactions_.pop_back();
  // And purge out old transactions to limit the pending size while waiting for an ignition sensor
```
@efernandez (Collaborator, Author):

Except for using `maxStamp()` instead of `minStamp()`, this is the same logic as before for the case when `started_` is still `false`.

Contributor:

I'm rather indifferent about which stamp gets used here. The sole purpose of this block is to prevent the pending queue from growing too large while waiting for the ignition transaction.

```cpp
  //
  // TODO(efernandez) Do '&min_time = std::as_const(start_time)' when C++17 is supported and we can use
  // std::as_const: https://en.cppreference.com/w/cpp/utility/as_const
  pending_transactions_.erase(
```
@efernandez (Collaborator, Author):

This performs the full interval check, so all remaining transactions' `[min, max]` intervals are after the ignition sensor transaction interval.

Equality is allowed for the min, but not for the max. This is because we don't want a transaction that happened at the same time as the ignition sensor transaction. See the motivation and use case in this PR description.
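This purge could be sketched as follows. It is a standalone reduction with hypothetical `Entry` and `purgePending` names; the real code erases from `pending_transactions_` using each transaction's `minStamp()` and `maxStamp()`:

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pending-queue entry: a sensor name plus a [min, max] stamp interval.
using Entry = std::pair<std::string, std::pair<double, double>>;

// Erase every pending transaction that is not strictly after the ignition
// interval: equality is allowed for the min stamp but not for the max stamp.
void purgePending(std::vector<Entry>& pending, double ignition_min, double ignition_max)
{
  pending.erase(
      std::remove_if(pending.begin(), pending.end(),
                     [ignition_min, ignition_max](const Entry& entry) {
                       return entry.second.first < ignition_min ||
                              entry.second.second <= ignition_max;
                     }),
      pending.end());
}
```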

Contributor:

Note that this purge is only performed once, when the first ignition transaction is received. Because of the asynchronous nature of everything, it is possible that your initial graph transaction, G1, is received before the accompanying transaction, T1. In such a case, this purge code will fire, and then T1 is received. Since `T1.maxStamp() == start_time`, the check above, `if (started_ && max_time < start_time)`, will also not purge out the T1 transaction. I feel like those two checks should both use `<= start_time`.

```diff
@@ -448,7 +448,7 @@ void FixedLagSmoother::transactionCallback(
     pending_transactions_.end(),
     transaction->stamp(),
     comparator);
-  pending_transactions_.insert(position, {sensor_name, std::move(transaction)});  // NOLINT
+  position = pending_transactions_.insert(position, {sensor_name, std::move(transaction)});  // NOLINT
```
@efernandez (Collaborator, Author):

This is required because `position` could be `end()` before inserting.
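A standalone illustration of the point (hypothetical `demoInsert` helper, not fuse code): when the computed position is `end()`, the iterator returned by `insert` is the only reliable way to refer to the newly inserted element, so the return value must be captured.

```cpp
#include <algorithm>
#include <vector>

// Insert a value at its sorted position. If the value sorts after every
// existing element, lower_bound returns end(); re-assigning `position` from
// insert's return value keeps it valid and pointing at the new element.
int demoInsert()
{
  std::vector<int> values = {1, 2, 3};
  auto position = std::lower_bound(values.begin(), values.end(), 5);  // == end()
  position = values.insert(position, 5);
  return *position;
}
```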

Contributor:

👍

@svwilliams (Contributor):

Apologies, but I am not quite following the argument in the PR description.

Since we are talking about the ignition sequence, I am assuming the graph, G, starts out as an empty set: G{}

We then receive an ignition transaction I that spans the timestamps I_min to I_max, I[I_min, I_max]. And we also have another transaction in the queue, T[T_min, T_max]. And the question is: under what conditions should we accept T and under what conditions should we reject T?

The processing sequence should go:

  • The graph is empty G{}
  • The first ignition transaction, I[I_min, I_max], is identified and processed separately.
  • A motion model transaction is generated between the involved stamps of I, M[I_min, I_max]
  • The ignition transaction and motion model transaction are added to the graph, G{I[I_min, I_max], M[I_min, I_max]}
  • The graph is optimized and published
  • A marginal transaction is computed to remove variables older than I_max - lag_duration. For the purposes of this conversation, let's assume no variables need to be marginalized.

Next optimization iteration:

  • A transaction T[T_min, T_max] is in the queue
  • Now what?

Case T_min < I_min:

This case is bad because of how variables are initialized. We treat the ignition transaction in a special way to allow all of the parts of the system to settle on an optimized initial state. Any transaction that involves a stamp before the initial state will get initialized to zero and likely cause convergence problems. Aside from that, having a hard initial time is probably a good idea and generally keeps everything sane.

Case I_max < T_min < T_max:

This is the easy, expected case. This is fine.

  • The graph is G{I[I_min, I_max], M[I_min, I_max]}
  • The queued transaction T[T_min, T_max] is processed
  • A motion model transaction is generated to connect the last graph state to the new transaction, as well as between the involved stamps of the transaction T, M[I_max, T_min, T_max]
  • The queued transaction and motion model transaction are added to the graph, G{I[I_min, I_max], M[I_min, I_max], T[T_min, T_max], M[I_max, T_min, T_max]}
  • The graph is optimized and published
  • A marginal transaction is computed to remove old variables
    Time marches on.

Case I_min < T_min < I_max < T_max:

This case seems fine to me, and it is not a condition you call out in your description as problematic.

  • The graph is G{I[I_min, I_max], M[I_min, I_max]}
  • The queued transaction T[T_min, T_max] is processed
  • A motion model transaction is generated to connect the states of the new transaction with the existing states. This is less clean than the "expected" case. The time sequence is: (I_min, T_min, I_max, T_max). The motion model must remove the previously generated transaction M[I_min, I_max] and regenerate constraints for that same period but in different chunks. M[I_min, I_max] is removed, and M[I_min, T_min, I_max, T_max] is added.
  • The queued transaction and motion model transaction are added to the graph, G{I[I_min, I_max], T[T_min, T_max], M[I_min, T_min, I_max, T_max]}
  • The graph is optimized and published
  • A marginal transaction is computed to remove old variables

Case I_min < T_min < T_max < I_max:

This case seems basically the same as the previous case to me, but you call this out explicitly as not okay. I'm hoping you can help me understand why.

  • The graph is G{I[I_min, I_max], M[I_min, I_max]}
  • The queued transaction T[T_min, T_max] is processed
  • A motion model transaction is generated to connect the states of the new transaction with the existing states. This is less clean than the "expected" case. The time sequence is: (I_min, T_min, T_max, I_max). The motion model must remove the previously generated transaction M[I_min, I_max] and regenerate constraints for that same period but in different chunks. M[I_min, I_max] is removed, and M[I_min, T_min, T_max, I_max] is added.
  • The queued transaction and motion model transaction are added to the graph, G{I[I_min, I_max], T[T_min, T_max], M[I_min, T_min, T_max, I_max]}
  • The graph is optimized and published
  • A marginal transaction is computed to remove old variables

@efernandez (Collaborator, Author) commented Apr 30, 2020:

> • A motion model transaction is generated between the involved stamps of I, M[I_min, I_max]
> • The ignition transaction and motion model transaction are added to the graph, G{I[I_min, I_max], M[I_min, I_max]}

Minor comment: I'm actually considering that no motion model is used, since my intention is to recreate the intermediate graphs that were throttled and therefore not recorded, using the transactions.

> Case T_min < I_min
> Case I_max < T_min < T_max
> Case I_min < T_min < I_max < T_max

I agree on all these cases, and this change does what you're saying.
Only T_min < I_min gets purged out of all these cases.

> Case I_min < T_min < T_max < I_max

Yes, this is the case I want to purge out in this change.
More specifically, I_min < T_min < T_max <= I_max.
Or just T_max <= I_max, since the T_min < I_min case has already been covered.

The reason I don't think a transaction with T_max <= I_max should go into the graph is that it was generated before or at the time stamp of the ignition sensor I_max.

Use Case

Let me explain my use case again in more detail, so it's clearer.

Recording Phase

  • I have a configuration with a fuse_publishers::SerializedPublisher that outputs the graphs G and the transactions T at full/optimization rate.
  • I rosbag record all transactions T
  • I cannot rosbag record all graphs G because that would quickly take a lot of disk space, and it also takes a lot of bandwidth.
  • So I throttle the graphs G with rosrun topic_tools throttle messages, so I only record the throttled graphs.

I end up with a bag file with all transactions T_i and only the throttled graphs G_j.

For example, let's say we recorded something like this:
[G_0, T_0, T_1, T_2, T_3, G_3, T_4, T_5, T_6, G_7, T_7]

IMPORTANT The following properties are important to see why I want to purge out T_max <= I_max:

  • Here _i is equivalent to _max
  • A transaction T_i is published together with a graph G_i, so T_i == G_i (as time stamps).
  • A graph G_i can be generated from a previous graph G_i-1 and the current transaction T_i as G_i = G_i-1 + T_i

Playback Phase

  • I have a configuration with:
    1. An ignition sensor that takes a fuse_msgs::SerializedGraph message and makes it into a transaction that the optimizer can use.
    2. A sensor model that simply takes fuse_msgs::SerializedTransaction messages and passes them to the optimizer.
  • I rosbag play the bag file remapping the topics so they are consumed by those sensor models and they don't collide with the output ones.
  • I also have a node so only the first graph published from the bag file is sent to the ignition sensor.

Now, let's assume we start playing the bag file at time B, such that G_0 < B < T_0.

We'll have this sequence of events:

  • T_0, T_1, T_2 and T_3 are published and they're queued.
  • G_3 is published. It's the first graph published, so the ignition sensor receives it.
  • All the transactions in the queue have T_i <= G_3 for i = 0 .. 3. So they're purged out.
  • 👉 If we don't purge T_3, we'll be applying its constraints to G_3, which already has them.
  • G_3 is a transaction from an ignition sensor and it's processed alone.
  • If T_4 had been in the queue, it'd be processed after.
  • T_4 is received and processed. This is the exact transaction we should apply after G_3, so we effectively get G_4 = G_3 + T_4. It can't be any other.
  • From now on all the transactions T_i for i >= 5 are received and processed.
  • No other graphs are processed because for this particular use case I only want the first one.

A unit test could be created to confirm that G_7 = G_3 + T_4 + T_5 + T_6 + T_7 is the same as the G_7 recorded in the bag file, as long as the same optimization configuration is used.
If a different configuration is used, the constraints should still be the same, but the optimized state / variables might differ.
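The reconstruction property described above could be exercised with a toy model along these lines. This is purely illustrative: graphs are reduced to sets of integer constraint ids, and the `Graph`/`applyTransactions` names are hypothetical; a real test would use `fuse_core::Graph` and the recorded serialized transactions:

```cpp
#include <initializer_list>
#include <set>

// Toy model: a graph is a set of constraint ids, and applying a transaction
// adds one id, so G_i = G_{i-1} + T_i. Starting from the recorded G_3 and
// applying T_4..T_7 should reproduce the recorded G_7.
using Graph = std::set<int>;

Graph applyTransactions(Graph graph, std::initializer_list<int> transactions)
{
  for (int t : transactions)
  {
    graph.insert(t);
  }
  return graph;
}
```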

Motivation

I think it's clear the motivation here is to play back the recorded (throttled) graph and transactions. This is useful to triage issues.

That being said, I'm open to other approaches. Indeed, this allows us to:

  • Re-generate the graphs that didn't get recorded because of the throttling

But I'm still not sure if it's convenient enough for other things:

  • Since this use case is simply playing back the recorded graph and transactions, the resulting optimized state/variables would always be the same. This doesn't really allow us to make and test changes. 😞
  • If the sensor models are internally throttling the input measurements with the throttle_period param, the transactions allow us to see the time stamps of the variables actually used, which will match the measurements used. However, this information can't be fed back into the sensor models, because they're transactions. 😞

I could also request permissions to contribute the GraphIgnition and Transaction sensor models I've implemented for this. They're fairly simple.

@svwilliams (Contributor):

I welcome all contributions you want to submit. But I also want to remind you that these sensor models can also be released in your own package, giving you full control over development and release. Whichever way works best for you.

To be clear, I'm not against tightening up the filtering criteria. I was merely trying to reject as few transactions as possible. In practice, a "live" ignition sensor will likely involve only one timestamp, making the distinction between `minStamp()` and `maxStamp()` moot.

Possible issues:

  1. When processing the stream of sensor transactions the first time, during the recording phase, it is possible (though unlikely) that all of the sensor transactions processed in a given optimization cycle involve only older stamps. If the current graph spans the time [I_min, I_max], then all sensor transactions involve stamps t such that I_min <= t <= I_max. The published transaction stamp is derived from all included transactions. So such a case would result in a transaction T with a timestamp before the timestamp of the previous published graph or transaction. In the above example, where you are reconstructing the graph as G7 = G4 + T5 + T6 + T7, it is possible that the T5 transaction stamp is before the G4 graph stamp. As such, you might incorrectly purge the T5 transaction. We could stamp the transaction with something meaningful before they are published out of the optimizer here: https://github.com/locusrobotics/fuse/blob/devel/fuse_optimizers/src/fixed_lag_smoother.cpp#L207
    But I'm not sure what that meaningful stamp would be exactly.

  2. The published transactions, the T4, T5, etc. in your example, will include the marginalization transactions. However, the fixed-lag smoother used during the replay will also try to marginalize out those same variables. Did you set the lag duration to something large during the bagfile playback?

@efernandez (Collaborator, Author) commented May 1, 2020:

  1. When processing the stream of sensor transactions the first time, during the recording phase, it is possible (though unlikely) that all of the sensor transactions processed in a given optimization cycle involve only older stamps. If the current graph spans the time [I_min, I_max], then all sensor transactions involve stamps t such that I_min <= t <= I_max. The published transaction stamp is derived from all included transactions. So such a case would result in a transaction T with a timestamp before the timestamp of the previous published graph or transaction. In the above example, where you are reconstructing the graph as G7 = G4 + T5 + T6 + T7, it is possible that the T5 transaction stamp is before the G4 graph stamp. As such, you might incorrectly purge the T5 transaction. We could stamp the transaction with something meaningful before they are published out of the optimizer here: https://github.com/locusrobotics/fuse/blob/devel/fuse_optimizers/src/fixed_lag_smoother.cpp#L207
    But I'm not sure what that meaningful stamp would be exactly.

I don't think this can happen with the current implementation. The reason is subtle, though:

  • The `fuse_core::Graph` doesn't have any stamp, so when the `fuse_publishers::SerializedPublisher` publishes a `<graph, transaction>` pair, it uses the transaction `stamp()` for the graph:

```cpp
const auto& stamp = transaction->stamp();
if (graph_publisher_.getNumSubscribers() > 0)
{
  fuse_msgs::SerializedGraph msg;
  msg.header.stamp = stamp;
```

  • This guarantees the graph and transaction messages recorded can be synchronized when playing things back.
  • In my `GraphIgnition` sensor model, the code that converts the graph messages into a transaction `I` with all the graph constraints and variables sets the transaction stamp to the one in the graph message, not the maximum involved stamp.
  • This way, when we call `maxStamp()` for this transaction, we get the maximum of all involved stamps, but also considering the `stamp()`:

```cpp
return std::max(*involved_stamps_.rbegin(), stamp_);
```

  • This way, `I_max` allows us to purge out the transactions `T` already applied to the graphs easily, because they satisfy `T_max <= I_max`.
  • Indeed, we also know that the transaction with `I_max == T_max` is exactly the last recorded transaction that was applied to the recorded graph.

So I think that meaningful stamp is already in place. 😃
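The `maxStamp()` behaviour relied on here can be sketched with plain doubles. The `StampedTransaction` struct is hypothetical and simply mirrors the quoted line; the real `fuse_core::Transaction` stores `ros::Time` values:

```cpp
#include <algorithm>
#include <set>

// Simplified transaction stamp bookkeeping: maxStamp() is the maximum of the
// involved stamps, but also considers the transaction's own stamp(), mirroring
// `return std::max(*involved_stamps_.rbegin(), stamp_);`.
struct StampedTransaction
{
  std::set<double> involved_stamps;
  double stamp;

  double maxStamp() const
  {
    return std::max(*involved_stamps.rbegin(), stamp);
  }
};
```

So if the graph message stamp is used as the transaction `stamp`, `maxStamp()` reflects it even when it exceeds every involved stamp.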

> 2. The published transactions, the T4, T5, etc. in your example, will include the marginalization transactions. However, the fixed-lag smoother used during the replay will also try to marginalize out those same variables. Did you set the lag duration to something large during the bagfile playback?

Yes, I disable marginalization because I can't allow new marginalization constraints to be generated.
I do that with a huge lag_duration. I'm using 1.0e+9. It can't be larger than the maximum ros::Time, i.e. it should be a value <= 2^31 for a 32-bit integer. But this is just a minor detail.

@efernandez efernandez self-assigned this May 2, 2020
@efernandez (Collaborator, Author):

@svwilliams I'd like to hear your thoughts about this PR after my last comment. 😃

@svwilliams (Contributor):

I agree the graph uses the transaction stamp, so the graph and transaction are guaranteed to be synchronized. However, the transaction stamp (of the <graph, transaction> pair) is derived from merging the transactions in the pending queue during the processQueue() call in each optimization iteration. If the first published <graph, transaction> pair, <G1, T1>, includes sensor measurements from (I_min, I_max), it is theoretically possible (though generally unlikely) that a subsequent optimization iteration only includes sensor measurements with stamps that are < I_max. In such a case, the second published <graph, transaction> pair, <G2, T2>, will have a stamp I_mid such that I_min < I_mid < I_max.

Imagine some sort of processor-intensive visual place recognition system or such that publishes measurements with a significant delay. The transaction gets stamped with the image frame capture time, but the measurement isn't published until 100ms after the frame was captured. This could lead to the first transaction, T1, stamped with t=1.000s, and the second transaction, T2, stamped with t=0.900s. If we filter out all transactions with stamps <= 1.000s in order to skip processing T1, then T2 will get incorrectly filtered out as well.

I'm wondering if we need a strictly monotonically increasing stamp on the published <graph, transaction> pairs? Or synchronize on the sequence number? I like the theory of the sequence number a lot, but ROS's strict control over the sequence number field always makes me hesitate to use it.

@ayrton04 (Contributor):

What's the status of this PR?

@efernandez (Collaborator, Author):

> What's the status of this PR?

@ayrton04 My understanding is that it's waiting on me to do a few more tests after @svwilliams' comments. And we likely need a different approach.

I've sent #195 and #196 so I can hopefully illustrate and motivate some changes here. But, obviously, only if that doesn't break the existing tests and functionality.

I set the question label to highlight that because there's no label for changes required or similar.

@svwilliams (Contributor):

I guess what I'm wondering is: Do we stamp the composite transaction that gets merged into the graph with the current time?

https://github.com/locusrobotics/fuse/blob/devel/fuse_optimizers/src/fixed_lag_smoother.cpp#L186

That would enforce the graph/transaction pair reported out of the smoother to have a strictly monotonically increasing timestamp. Since this transaction is a composite of multiple transactions as well as the motion models, I'm not sure there is an "expected" header stamp for this transaction. And I don't think any of the existing Publisher classes use the transaction header stamp for anything, so this change shouldn't break anything.

@efernandez (Collaborator, Author):

> Do we stamp the composite transaction that gets merged into the graph with the current time?

@svwilliams I believe that happens when we merge the transactions in https://github.com/locusrobotics/fuse/blob/devel/fuse_optimizers/src/fixed_lag_smoother.cpp#L295 and wherever we call merge, which sets the transaction stamp to the maximum of the two transactions being merged:

```cpp
void Transaction::merge(const Transaction& other, bool overwrite)
{
  stamp_ = std::max(stamp_, other.stamp_);
```

@svwilliams (Contributor):

Correct. Right now we choose the max stamp. I'm curious if we want to set it to ros::Time::now() to force that stamp to be monotonically increasing.

@efernandez (Collaborator, Author) commented Jun 4, 2020:

> I'm curious if we want to set it to ros::Time::now() to force that stamp to be monotonically increasing.

Maybe. I don't know.

I like that we can track down what sensor model input messages were used to generate the transaction using the timestamp. Actually, we can do that by looking at the constraint variables. The constraint has the source and its variables contain the timestamps of the messages used. So it probably doesn't matter much if the transaction stamp is the received time, i.e. ros::Time::now().

I have a tool that filters the sensor model inputs based on that, so I get the same inputs even when I'm throttling things out with the throttle_period ROS param. As long as I can still retrieve the sensor model inputs, it's ok for me. In this case it actually looks like it'd be ok.

@svwilliams (Contributor):

@efernandez Are we ready for this to be merged in? Or are we still contemplating other changes?

@efernandez (Collaborator, Author) commented Jun 22, 2020:

> Are we ready for this to be merged in? Or are we still contemplating other changes?

@svwilliams I'm not contemplating any other changes, so it should be good to merge it in.

That being said, I've been thinking of some changes for the ignition/reset logic. It's a bit off topic, but I'd like to hear your thoughts. Below I try to explain my proposed changes in brief:

  1. As a building piece, let's add the ability to remove all the variables or constraints in the graph by specifying that in the `Transaction` and changing the `Graph::update` logic, which is currently implemented as follows, clearly not supporting that:

```cpp
void Graph::update(const Transaction& transaction)
{
  // Update the graph with a new transaction. In order to keep the graph consistent, variables are added first,
  // followed by the constraints which might use the newly added variables. Then constraints are removed so that
  // the variable usage is updated. Finally, variables are removed.

  // Insert the new variables into the graph
  for (const auto& variable : transaction.addedVariables())
  {
    addVariable(variable.clone());
  }
  // Insert the new constraints into the graph
  for (const auto& constraint : transaction.addedConstraints())
  {
    addConstraint(constraint.clone());
  }
  // Delete constraints from the graph
  for (const auto& constraint_uuid : transaction.removedConstraints())
  {
    removeConstraint(constraint_uuid);
  }
  // Delete variables from the graph
  for (const auto& variable_uuid : transaction.removedVariables())
  {
    removeVariable(variable_uuid);
  }
}
```

  2. Since we currently serialize the `Transaction` class, we don't want to break any previously serialized transaction recorded, so we can use `fuse_core::uuid::NIL` as a special value to indicate ALL variables or constraints, and use that in the `removedConstraints()` or `removedVariables()` to indicate we want all of them removed from the graph. This means any sensor model can request the graph to be cleared at any time by sending a transaction like the following to the optimizer:

```cpp
fuse_core::Transaction transaction;
transaction.removeVariable(fuse_core::uuid::NIL);
transaction.removeConstraint(fuse_core::uuid::NIL);
```

The actual way we request all variables or constraints to be removed from the graph could hide the detail of the special UUID value used to indicate that. That means we could have:

```cpp
transaction.removeAllVariables();
transaction.removeAllConstraints();
```

Or we could only allow removing all variables if all constraints are also removed, in which case we could simply expose a method such as `transaction.removeAll();` or `transaction.clear();`, or something similar.

  3. The `Graph::update` logic should handle this new case. This could be done with the following naive implementation:

```cpp
void Graph::update(const Transaction& transaction)
{
  // Remove all constraints from the graph
  for (const auto& constraint_uuid : transaction.removedConstraints())
  {
    if (constraint_uuid == fuse_core::uuid::NIL)
    {
      clearConstraints();
      break;
    }
  }
  // Remove all variables from the graph
  for (const auto& variable_uuid : transaction.removedVariables())
  {
    if (variable_uuid == fuse_core::uuid::NIL)
    {
      clearVariables();
      break;
    }
  }
  // Insert the new variables into the graph
  for (const auto& variable : transaction.addedVariables())
  {
    addVariable(variable.clone());
  }
  // Insert the new constraints into the graph
  for (const auto& constraint : transaction.addedConstraints())
  {
    addConstraint(constraint.clone());
  }
  // Delete constraints from the graph
  for (const auto& constraint_uuid : transaction.removedConstraints())
  {
    if (constraint_uuid != fuse_core::uuid::NIL)
    {
      removeConstraint(constraint_uuid);
    }
  }
  // Delete variables from the graph
  for (const auto& variable_uuid : transaction.removedVariables())
  {
    if (variable_uuid != fuse_core::uuid::NIL)
    {
      removeVariable(variable_uuid);
    }
  }
}
```

where `clearConstraints()` and `clearVariables()` could be new pure virtual methods:

```cpp
virtual void clearConstraints() = 0;
virtual void clearVariables() = 0;
```

that could be efficiently implemented as follows for the `fuse_graphs::HashGraph`:

```cpp
void clearConstraints()
{
  constraints_.clear();
  constraints_by_variable_uuid_.clear();
}

void clearVariables()
{
  variables_.clear();
  variables_on_hold_.clear();
}
```
  4. I believe `Transaction::merge()` should work as is:

```cpp
void Transaction::merge(const Transaction& other, bool overwrite)
{
  stamp_ = std::max(stamp_, other.stamp_);
  involved_stamps_.insert(other.involved_stamps_.begin(), other.involved_stamps_.end());
  for (const auto& added_constraint : other.added_constraints_)
  {
    addConstraint(added_constraint, overwrite);
  }
  for (const auto& removed_constraint : other.removed_constraints_)
  {
    removeConstraint(removed_constraint);
  }
  for (const auto& added_variable : other.added_variables_)
  {
    addVariable(added_variable, overwrite);
  }
  for (const auto& removed_variable : other.removed_variables_)
  {
    removeVariable(removed_variable);
  }
}
```

    It shouldn't be a problem to have more than one fuse_core::uuid::NIL removed variable or constraint, although it'd be handled as a single one. Similarly, with the implementation proposed above for the Graph::update() method, the removal of all variables or constraints in the graph would happen before any additions. This allows an ignition sensor to create a new prior constraint at the same time it request the current graph to be cleared. For instance, the fuse_models::Unicycle2DIgnition that sends this transaction:
    auto transaction = fuse_core::Transaction::make_shared();
    transaction->stamp(stamp);
    transaction->addInvolvedStamp(stamp);
    transaction->addVariable(position);
    transaction->addVariable(orientation);
    transaction->addVariable(linear_velocity);
    transaction->addVariable(angular_velocity);
    transaction->addVariable(linear_acceleration);
    transaction->addConstraint(position_constraint);
    transaction->addConstraint(orientation_constraint);
    transaction->addConstraint(linear_velocity_constraint);
    transaction->addConstraint(angular_velocity_constraint);
    transaction->addConstraint(linear_acceleration_constraint);
    // Send the transaction to the optimizer.
    sendTransaction(transaction);

    Would simply have to add this to that transaction before calling sendTransaction:
transaction->removeVariable(fuse_core::uuid::NIL);
transaction->removeConstraint(fuse_core::uuid::NIL);
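A minimal sketch of the proposed Graph::update() ordering, under the stated assumption that removals are applied before additions and that the reserved NIL id means "remove everything". `MiniGraph`, its `update()` signature, and the string `NIL` constant are all illustrative, not the actual fuse API (fuse uses boost UUIDs, not strings); the point is only to show that an ignition transaction can clear the graph and re-prime it in a single update.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Illustrative stand-in for fuse_core::uuid::NIL.
static const std::string NIL = "00000000-0000-0000-0000-000000000000";

// Hypothetical, simplified model of a graph holding only variable ids.
struct MiniGraph
{
  std::set<std::string> variables;

  void update(const std::vector<std::string>& removed,
              const std::vector<std::string>& added)
  {
    // Step 1: process all removals first, so an ignition transaction can
    // clear the graph before its new prior variables are added.
    for (const auto& id : removed)
    {
      if (id == NIL)
        variables.clear();  // NIL is interpreted as "remove all"
      else
        variables.erase(id);
    }
    // Step 2: only then apply the additions.
    for (const auto& id : added)
      variables.insert(id);
  }
};
```

With this ordering, a single update carrying `removed = {NIL}` plus the new prior variables leaves the graph containing only the prior, exactly the behavior an ignition sensor wants.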

It shouldn't be a problem if these transactions get merged with others that add more variables and constraints, because the graph would be cleared before any variables or constraints are added. This would still rely on the optimizer purging any transaction older than the ignition one. If we wanted, I think it'd be possible to detect that a transaction contains fuse_core::uuid::NIL and treat it as an ignition transaction, or a non-mergeable transaction. Technically, it can be merged with future transactions, but not with past ones. But if we never merge it, that would be fine, and equivalent to what we do now.

I believe this would allow us to get rid of the reset logic in the ignition sensor models:

if (!params_.reset_service.empty())
{
  // Wait for the reset service
  while (!reset_client_.waitForExistence(ros::Duration(10.0)) && ros::ok())
  {
    ROS_WARN_STREAM("Waiting for '" << reset_client_.getService() << "' service to become available.");
  }
  auto srv = std_srvs::Empty();
  if (!reset_client_.call(srv))
  {
    // The reset() service failed. Propagate that failure to the caller of this service.
    throw std::runtime_error("Failed to call the '" + reset_client_.getService() + "' service.");
  }
}

This would make them easier to implement. It would also make updates faster, and we wouldn't lose the transactions that we currently lose while the reset service is being called. That loss happens because the reset takes time and stops all the plugins, so they stop receiving measurements. The approach proposed here is equivalent to not using the reset service, which is actually possible today, so in that sense this functionality already exists. The only difference is that with this change we could indicate at the transaction level that we want the graph to be cleared, which I believe we can't do now.

I wonder if ignition sensors really have to support calling the reset service. After all, if the graph gets cleared, all plugins will be notified about that. Then again, maybe that notification arrives too late, and some sensor models could have already generated constraints that reference variables from the graph before it was cleared. 🤔

@svwilliams
Contributor

I am definitely not opposed to streamlining the reset operation. Adding a reset option to the transaction is an interesting idea, but I'll need to spend some time to fully understand all of the ramifications.

But a few immediate thoughts:

  • I dislike abusing the UUID and giving NIL a special meaning. I'd rather extend the transaction to directly support the reset feature. The Boost Serialization system has support for deserializing different versions of the same class, so it should be possible to extend the Transaction and still allow older versions to be deserialized. To be clear, I've never used that feature, but in theory it is possible.
    https://www.boost.org/doc/libs/1_65_0/libs/serialization/doc/tutorial.html#versioning
  • One of the reasons for having the sensors do an unsubscribe - subscribe cycle during reset is to ensure the ROS callback queues get cleared of old data and we start fresh after the reset. It also gives the sensors and motion models a hook to clear out any internal state. While most of the sensor models have rather trivial reset logic, the internal state clearing is important to the motion model state tracking. In any alternative reset implementation, we need to make sure the sensors and motions have that reset hook, and that the reset gets triggered at the correct time.
    https://github.com/locusrobotics/fuse/blob/devel/fuse_models/src/unicycle_2d.cpp#L225
  • If the goal is to make the reset operation "faster", there may be better sensor reset logic. Maybe we clear the callback queue instead of doing an unsubscribe-subscribe cycle? Maybe we transmit the reset time to the sensors/motion models and let them filter out unwanted messages?
  • Before the addition of reset logic, the stop()/start() methods were added to support the robot_localization "enable service" feature. I think that is an important feature to keep, and having the stop() methods actually unsubscribe from the topics seems like the proper operation. The reset implementation then piggybacked off of those functions and did a stop()-start() cycle. We could make a special reset() method in addition to the stop() and start() methods, with a default implementation that does a stop()-start() cycle. That would allow special "fast" reset logic to be implemented.
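The last bullet can be sketched as follows. This is a hypothetical illustration, not the actual fuse plugin interface: `MiniSensorModel`, its `log` member, and `FastResetSensor` are invented names. The idea is a virtual `reset()` whose default implementation performs the existing `stop()`-`start()` cycle, while plugins with cheaper reset logic may override it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of a plugin base class with a dedicated reset() hook.
class MiniSensorModel
{
public:
  virtual ~MiniSensorModel() = default;
  virtual void start() { log.push_back("start"); }  // e.g. subscribe
  virtual void stop() { log.push_back("stop"); }    // e.g. unsubscribe
  // Default reset: the current behavior, a stop()-start() cycle.
  virtual void reset() { stop(); start(); }
  std::vector<std::string> log;  // records calls, for illustration only
};

// A plugin with "fast" reset logic overrides reset() and skips the cycle.
class FastResetSensor : public MiniSensorModel
{
public:
  void reset() override { log.push_back("fast-reset"); }
};
```

Existing plugins would keep today's behavior without any changes, since the default `reset()` delegates to `stop()` and `start()`, while a plugin like the unicycle motion model could override `reset()` to just clear its internal state tracking.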

If you want to pursue this, we can work out some way of collaborating on the feature. A shared branch, or a PR I have push access to, or something.

Since I want to merge this PR, I'm going to copy this conversation into an issue first. We can continue the discussion there.

@svwilliams
Contributor

See issue #199

@svwilliams svwilliams merged commit 2239cff into locusrobotics:devel Jun 24, 2020
@efernandez efernandez deleted the purge-transactions-older-than-ignition-max-stamp branch June 24, 2020 19:04