One of the inputs to the student policy (neural network) is the reference direction, which the paper describes as follows: "The latter is represented as a normalised vector heading toward the reference point 1s in the future with respect to the closest reference state".
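To make my reading of that description concrete, here is a minimal sketch of the computation. It is not taken from the repository: the function name `reference_direction`, its arguments, and the choice to measure the vector from the current position (rather than from the closest reference state) are all my assumptions.

```python
import numpy as np


def reference_direction(position, ref_times, ref_positions, horizon=1.0):
    """Unit vector from `position` toward the reference point `horizon`
    seconds after the closest reference state (hypothetical helper)."""
    position = np.asarray(position, dtype=float)
    ref_times = np.asarray(ref_times, dtype=float)
    ref_positions = np.asarray(ref_positions, dtype=float)

    # Closest reference state, by Euclidean distance to the current position.
    closest = int(np.argmin(np.linalg.norm(ref_positions - position, axis=1)))

    # Look `horizon` seconds ahead, clamped to the end of the trajectory.
    t_target = min(ref_times[closest] + horizon, ref_times[-1])
    target = min(int(np.searchsorted(ref_times, t_target)), len(ref_times) - 1)

    # Normalise; return a zero vector if we are already at the target point.
    delta = ref_positions[target] - position
    norm = np.linalg.norm(delta)
    return delta / norm if norm > 1e-9 else np.zeros_like(delta)


# Made-up reference: 5 s along the x axis at 2 m/s, sampled every 0.1 s.
ref_times = np.linspace(0.0, 5.0, 51)
ref_positions = np.stack(
    [2.0 * ref_times, np.zeros_like(ref_times), np.zeros_like(ref_times)], axis=1
)
d = reference_direction([0.0, 1.0, 0.0], ref_times, ref_positions)
```

Whichever trajectory is passed as `ref_positions` fully determines this input, which is why the source of the reference matters for the question below.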
In the released code, the reference state is read from the reference_trajectory.csv file. That reference trajectory is computed using full information; to quote the paper: "To bias the sampled trajectories toward obstacle-free regions, we replace the raw reference trajectory τ_ref in Equation 2 with a global collision-free trajectory τ_gbl from start to goal, that we compute using the approach of Liu et al. [13]".
The paper also states:
"The trajectory τ_gbl is not observed by the student policy, but only by the expert. The student is only provided with a straight, potentially not collision-free, trajectory from start to end to convey the goal."
However, when the neural network is trained, the reference/goal direction is computed from the data in reference.csv (τ_gbl, globally optimal and collision-free). Doesn't this lead to data leakage? The network has then already seen information about the globally optimal trajectory.
At test time, i.e. on a real quadrotor, τ_gbl would not be available; only a direct path to the goal, possibly passing through obstacles, would be. What happens in that case? How is the goal direction computed?
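To illustrate the concern, here is a small sketch comparing the direction obtained from a straight start-to-goal line with the one obtained from a collision-free detour. Both trajectories, the helper `direction_to_lookahead`, and all numbers are invented for illustration; none of this comes from the repository or the paper.

```python
import numpy as np


def direction_to_lookahead(position, times, points, horizon=1.0):
    """Unit vector toward the point `horizon` seconds past the closest sample."""
    closest = int(np.argmin(np.linalg.norm(points - position, axis=1)))
    t_target = min(times[closest] + horizon, times[-1])
    idx = min(int(np.searchsorted(times, t_target)), len(times) - 1)
    delta = points[idx] - position
    return delta / np.linalg.norm(delta)


# 10 s trajectories sampled every 0.1 s between the same start and goal.
times = np.linspace(0.0, 10.0, 101)
start = np.array([0.0, 0.0, 1.0])
goal = np.array([20.0, 0.0, 1.0])
alpha = (times / times[-1])[:, None]

# What the student is supposed to receive: a straight line to the goal.
straight = start + alpha * (goal - start)

# Stand-in for a collision-free global trajectory: the same line bulging in y.
detour = straight.copy()
detour[:, 1] += 3.0 * np.sin(np.pi * alpha[:, 0])

position = np.array([2.0, 0.5, 1.0])
d_straight = direction_to_lookahead(position, times, straight)
d_detour = direction_to_lookahead(position, times, detour)

# Angle between the two candidate network inputs, in degrees.
cos_angle = np.clip(d_straight @ d_detour, -1.0, 1.0)
angle_deg = float(np.degrees(np.arccos(cos_angle)))
print(f"angle between the two reference directions: {angle_deg:.1f} deg")
```

In this toy setup the two directions point to opposite sides of the straight line, so a network trained on the detour-based direction would receive a noticeably different input at test time.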
@antonilo