xTED

Cross-Domain Adaptation via Diffusion-Based Trajectory Editing

Abstract

Reusing pre-collected data from different domains is an appealing solution for decision-making tasks where data is scarce in the target domain but relatively abundant in related domains. Existing cross-domain policy transfer methods mostly aim to learn domain correspondences or corrections to facilitate policy learning, for example by learning domain/task-specific discriminators, representations, or policies. This design philosophy often results in heavy model architectures or task/domain-specific modeling, which limits flexibility. This raises a question: can we bridge domain gaps universally at the data level, instead of relying on complex downstream cross-domain policy transfer models? In this study, we propose the Cross-Domain Trajectory EDiting (xTED) framework, which employs a specially designed diffusion model for cross-domain trajectory adaptation. Our proposed model architecture captures the intricate dependencies among states, actions, and rewards, as well as the dynamics patterns within the target data. Using the pre-trained diffusion model as a prior, source-domain trajectories can be transformed to match target-domain properties while preserving their original semantic information. This process implicitly corrects the underlying domain gaps, enhancing state realism and dynamics reliability in the source data, and allows flexible combination with various downstream policy learning methods. Despite its simplicity, xTED demonstrates superior performance in extensive simulation and real-robot experiments.
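One common way to use a pre-trained diffusion model as an editing prior is to partially noise a sample and then denoise it with the model. This page does not spell out xTED's exact editing procedure, so the following is only a minimal sketch under stated assumptions: trajectories are reduced to 1-D lists of floats, the noise schedule is a linear beta schedule, the reverse process is a deterministic DDIM-style update, and the trained denoiser is replaced by a hypothetical closed-form stand-in (`gaussian_prior_denoiser`) that assumes Gaussian target data. None of these names come from the xTED codebase.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def gaussian_prior_denoiser(target_mean, target_std):
    """Hypothetical stand-in for a trained denoiser: the exact posterior mean
    of the clean value when target data is N(target_mean, target_std^2)."""
    var = target_std ** 2

    def predict_clean(x_t, alpha_bar):
        gain = math.sqrt(alpha_bar) * var / (alpha_bar * var + 1.0 - alpha_bar)
        return target_mean + gain * (x_t - math.sqrt(alpha_bar) * target_mean)

    return predict_clean


def edit_trajectory(source, alpha_bars, edit_steps, predict_clean, rng):
    """Noise `source` up to `edit_steps` diffusion steps, then denoise back."""
    if not 0 <= edit_steps <= len(alpha_bars):
        raise ValueError("edit_steps must lie in [0, len(alpha_bars)]")
    if edit_steps == 0:
        return list(source)  # no noise added, nothing to edit

    # Forward process: jump directly to the chosen noise level.
    k = edit_steps - 1
    ab_k = alpha_bars[k]
    x = [math.sqrt(ab_k) * s + math.sqrt(1.0 - ab_k) * rng.gauss(0.0, 1.0)
         for s in source]

    # Reverse process: deterministic DDIM-style steps guided by the prior.
    for t in range(k, -1, -1):
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        stepped = []
        for value in x:
            x0_hat = predict_clean(value, ab_t)
            eps_hat = (value - math.sqrt(ab_t) * x0_hat) / math.sqrt(1.0 - ab_t)
            stepped.append(math.sqrt(ab_prev) * x0_hat
                           + math.sqrt(1.0 - ab_prev) * eps_hat)
        x = stepped
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    schedule = make_alpha_bars(100)
    denoiser = gaussian_prior_denoiser(target_mean=0.0, target_std=1.0)
    source_traj = [5.0, 5.2, 4.8, 5.1]  # illustrative values, not real data
    print(edit_trajectory(source_traj, schedule, 100, denoiser, rng))
```

In this toy setting, `edit_steps` controls the trade-off: with few steps the output stays close to the source trajectory, while with more steps it is pulled further toward the target distribution.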

Updated Dynamics Error Distribution for Edited Source Data

Updated dynamics error distribution plots. The 'Tgt' results are evaluated on the mixture of the HalfCheetah MR, ME, and M (medium-replay, medium-expert, and medium) datasets from D4RL. The 'Src (Edited)' results are evaluated on the mixture of source data edited by diffusion models trained on all of the HalfCheetah MR, ME, and M datasets.

Figure 1

Figure 2

Figure 3

Domain Gap | Data         | MSE Error (Mean ± Std) | MAE Error (Mean ± Std)
Gravity    | Tgt          | 0.58 ± 1.75            | 0.33 ± 0.21
Gravity    | Src (Edited) | 1.02 ± 1.81            | 0.45 ± 0.23
Gravity    | Src          | 4.62 ± 4.53            | 1.01 ± 0.44
Thigh Size | Tgt          | 0.58 ± 1.75            | 0.33 ± 0.21
Thigh Size | Src (Edited) | 1.18 ± 2.20            | 0.49 ± 0.29
Thigh Size | Src          | 3.88 ± 3.35            | 1.02 ± 0.46
Friction   | Tgt          | 0.58 ± 1.75            | 0.33 ± 0.21
Friction   | Src (Edited) | 1.61 ± 2.20            | 0.61 ± 0.32
Friction   | Src          | 5.54 ± 3.36            | 1.18 ± 0.37

Table 1: Numerical results of dynamics errors of source data, edited source data and target data.
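The page does not define how the dynamics error is computed, so the following is a minimal sketch of one plausible reading: for each transition, compare the next state recorded in the dataset with a reference next state (for example, one obtained from the target-domain simulator or a learned dynamics model), then report the mean and standard deviation of the per-transition MSE and MAE. The function names, the use of the population standard deviation, and the example values are all assumptions for illustration.

```python
from statistics import mean, pstdev


def per_transition_errors(recorded_next, reference_next):
    """Return per-transition MSE and MAE lists between two sets of next states.

    Each argument is a list of state vectors (lists of floats) of equal shape.
    """
    mse_list, mae_list = [], []
    for recorded, reference in zip(recorded_next, reference_next):
        diffs = [r - q for r, q in zip(recorded, reference)]
        mse_list.append(mean([d * d for d in diffs]))
        mae_list.append(mean([abs(d) for d in diffs]))
    return mse_list, mae_list


def summarize(errors):
    """Mean and population standard deviation (assumed convention) of a list."""
    return mean(errors), pstdev(errors)


if __name__ == "__main__":
    # Illustrative placeholder states, not values from the paper.
    recorded = [[1.0, 2.0], [0.0, 0.0]]
    reference = [[0.0, 0.0], [0.0, 0.0]]
    mse, mae = per_transition_errors(recorded, reference)
    mse_mean, mse_std = summarize(mse)
    mae_mean, mae_std = summarize(mae)
    print(f"MSE: {mse_mean:.2f} ± {mse_std:.2f}")
    print(f"MAE: {mae_mean:.2f} ± {mae_std:.2f}")
```

Under this reading, a lower mean for 'Src (Edited)' than for 'Src' would indicate that the edited transitions are more consistent with the target-domain dynamics.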

Replaying Edited and Original Source Trajectory Pairs

We select two pairs of edited and original source trajectories (indices 10 and 100 in the datasets) for each domain gap in the HalfCheetah environment.

Gravity Videos

1. HalfCheetah Original

1. HalfCheetah Edited

2. HalfCheetah Original

2. HalfCheetah Edited

Downstream Robot Manipulation Tasks

We conduct real-world experiments in robotic environments where the target data is collected with (a) a WidowX robot and the source data with (b) an Airbot robot, 100 trajectories each. We build three manipulation tasks: (1) picking up a red cup on a silver pan (Cup); (2) picking up a duck on a green plate (Duck); (3) moving a pot from right to left (Pot).

Figure 4: Comparison between the target domain and the source domain. Top: the target and source domains differ substantially in embodiment and camera viewpoint; the top-right panel shows snapshots from the base and wrist camera views during data collection in the target and source domains for the Cup, Duck, and Pot tasks. Bottom: experiment results. The average success rate for real-robot tasks with and without distractors is computed over 3 seeds.

Results on Real Robots

Red cup on silver pan
Duck on green plate
Move pot

Figure: Real robot experimental results. Success rate is averaged over 10 episodes and 3 seeds.

Target + Edited Source (xTED)

Red cup on silver pan

Red cup on silver pan with distraction

Duck on green plate

Duck on green plate with distraction

Move pot

Move pot with distraction

Target

Red cup on silver pan

Red cup on silver pan with distraction

Duck on green plate

Duck on green plate with distraction

Move pot

Move pot with distraction

Target + Source

Red cup on silver pan

Red cup on silver pan with distraction

Duck on green plate

Duck on green plate with distraction

Move pot

Move pot with distraction

BibTeX


        @inproceedings{anonymous2024xted,
          title={xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing},
          author={Authors, Anonymous},
          booktitle={Under Review}
        }