Martial Hebert

Carnegie Mellon University

H-index: 122

North America-United States

Description

Martial Hebert is a distinguished researcher at Carnegie Mellon University who specializes in Computer Vision and Robotics, with an exceptional h-index of 122 overall and a recent h-index of 65 (since 2020).

Professor Information

University

Carnegie Mellon University

Position

___

Citations(all)

61130

Citations(since 2020)

19958

Cited By

51552

hIndex(all)

122

hIndex(since 2020)

65

i10Index(all)

372

i10Index(since 2020)

194

University Profile Page

Carnegie Mellon University

Research & Interests List

Computer Vision

Robotics

Top articles of Martial Hebert

Geodesic turnpikes for robot motion planning

Endowing the configuration space of a robot with an appropriate metric structure and characterizing and computing the corresponding geodesics are central issues in motion planning. As recently observed in [1], the geodesics of SE(2) equipped with the so-called minimum swept-volume distance exhibit in practice a behavior akin to the turnpike property in optimal control, with transient phases separated by a longer steady state close to prototypical trajectories, the turnpikes [2]. This presentation gives a theoretical counterpoint to this empirical observation with a formal definition of geodesic turnpikes using vector fields on Finsler manifolds, a simple differential characterization of geodesics in the case where the manifold is a Lie group and the Finsler distance is left-invariant, and, in the case where the corresponding operator is also reversible, a conjecture characterizing the turnpikes by vector fields satisfying simple conditions in the corresponding Lie algebras. As a proof of concept, closed-form (resp. numerical) procedures for computing these vector fields according to this conjecture are given for SE(2) equipped with the left-invariant Riemannian (resp. minimum swept-volume) distance introduced in [3] (resp. [1]) for rectangular shapes. The solutions empirically match, in both cases, the observed turnpike behavior of the corresponding geodesics. In the minimum swept-volume distance case, using the turnpikes for initialization also yields an order-of-magnitude speedup in computing geodesics.
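
For readers unfamiliar with the terminology, the central objects can be written compactly; the notation below is generic and is not taken from the paper:

```latex
% Generic notation, not taken from the paper.
% Left-invariance of a distance d on a Lie group G:
d(ga, gb) = d(a, b) \qquad \forall\, g, a, b \in G
% A geodesic from a to b minimizes the Finsler length functional:
\gamma^{\star} = \operatorname*{arg\,min}_{\gamma(0)=a,\ \gamma(1)=b}
    \int_{0}^{1} F\bigl(\gamma(t), \dot{\gamma}(t)\bigr)\, dt
% Turnpike behavior: \gamma^{\star} remains close to an integral curve of a
% fixed vector field X (the turnpike) for most of t \in [0, 1].
```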

Authors

Yann de Mont-Marin,Martial Hebert,Jean Ponce

Published Date

2024/3/1

Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models

Despite recent significant strides achieved by diffusion-based Text-to-Image (T2I) models, current systems are still less capable of ensuring decent compositional generation aligned with text prompts, particularly for multi-object generation. This work illuminates the fundamental reasons for such misalignment, pinpointing issues related to low attention activation scores and mask overlaps. While previous research efforts have individually tackled these issues, we assert that a holistic approach is paramount. Thus, we propose two novel objectives, the Separate loss and the Enhance loss, that reduce object mask overlaps and maximize attention scores, respectively. Our method diverges from conventional test-time-adaptation techniques, focusing on finetuning critical parameters, which enhances scalability and generalizability. Comprehensive evaluations demonstrate the superior performance of our model in terms of image realism, text-image alignment, and adaptability, notably outperforming prominent baselines. Ultimately, this research paves the way for T2I diffusion models with enhanced compositional capacities and broader applicability. The project webpage is available at https://zpbao.github.io/projects/SepEn/.
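
The two objectives lend themselves to a compact illustration. Below is a minimal, hypothetical sketch of an overlap penalty and an activation penalty on per-token cross-attention maps; the paper's exact loss formulations may differ:

```python
import numpy as np

def separate_loss(attn_a: np.ndarray, attn_b: np.ndarray) -> float:
    """Penalize spatial overlap between two object tokens' attention maps.
    attn_a, attn_b: (H, W) maps, each normalized to sum to 1.
    Hypothetical form, for illustration only."""
    return float(np.sum(np.minimum(attn_a, attn_b)))

def enhance_loss(attn: np.ndarray) -> float:
    """Encourage a high peak activation for an object token's map."""
    return float(1.0 - attn.max())

def compositional_loss(maps, w_sep=1.0, w_enh=1.0):
    """Combined objective over all object-token pairs (weights assumed)."""
    sep = sum(separate_loss(a, b) for i, a in enumerate(maps)
              for b in maps[i + 1:])
    enh = sum(enhance_loss(a) for a in maps)
    return w_sep * sep + w_enh * enh
```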

Authors

Zhipeng Bao,Yijun Li,Krishna Kumar Singh,Yu-Xiong Wang,Martial Hebert

Journal

arXiv preprint arXiv:2312.06712

Published Date

2023/12/10

Discovering multiple algorithm configurations

Many practitioners in robotics regularly depend on classic, hand-designed algorithms. Often the performance of these algorithms is tuned across a dataset of annotated examples which represent typical deployment conditions. Automatic tuning of these settings is traditionally known as algorithm configuration. In this work, we extend algorithm configuration to automatically discover multiple modes in the tuning dataset. Unlike prior work, these configuration modes represent multiple dataset instances and are detected automatically during the course of optimization. We propose three methods for mode discovery: a post hoc method, a multistage method, and an online algorithm using a multi-armed bandit. Our results characterize these methods on synthetic test functions and in multiple robotics application domains: stereoscopic depth estimation, differentiable rendering, motion planning, and visual odometry. We show …
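
As a rough illustration of the online variant, here is a hedged sketch of routing dataset instances among candidate configuration modes with a UCB1 bandit; the names and the update rule are assumptions, not the paper's algorithm:

```python
import math

def ucb_mode_select(counts, means, t, c=2.0):
    """Pick a configuration mode by UCB1 from per-mode pull counts and
    running mean scores."""
    for k, n in enumerate(counts):
        if n == 0:
            return k  # try every mode once before exploiting
    return max(range(len(counts)),
               key=lambda k: means[k] + c * math.sqrt(math.log(t) / counts[k]))

def discover_modes(instances, configs, evaluate):
    """Online mode discovery sketch: route each dataset instance to the
    configuration mode expected to score best, then update that mode's
    statistics. `evaluate(config, instance)` is a user-supplied metric."""
    K = len(configs)
    counts, means = [0] * K, [0.0] * K
    assignment = []
    for t, inst in enumerate(instances, start=1):
        k = ucb_mode_select(counts, means, t)
        score = evaluate(configs[k], inst)
        counts[k] += 1
        means[k] += (score - means[k]) / counts[k]  # incremental mean
        assignment.append(k)
    return assignment
```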

Authors

Leonid Keselman,Martial Hebert

Published Date

2023/5/29

Optimizing Algorithms From Pairwise User Preferences

Typical black-box optimization approaches in robotics focus on learning from metric scores. However, that is not always possible, as not all developers have ground truth available. Learning appropriate robot behavior in human-centric contexts often requires querying users, who typically cannot provide precise metric scores. Existing approaches leverage human feedback in an attempt to model an implicit reward function; however, this reward may be difficult or impossible to effectively capture. In this work, we introduce SortCMA to optimize algorithm parameter configurations in high dimensions based on pairwise user preferences. SortCMA efficiently and robustly leverages user input to find parameter sets without directly modeling a reward. We apply this method to tuning a commercial depth sensor without ground truth, and to robot social navigation, which involves highly complex preferences over robot behavior …
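
To make the idea concrete, here is a minimal sketch of a preference-only search loop in the spirit of SortCMA; `prefer` is a hypothetical user oracle, and the update is a deliberate simplification of CMA-style adaptation:

```python
import numpy as np
from functools import cmp_to_key

def preference_search(prefer, mean, sigma=0.5, iters=20, pop=8, elite=3,
                      seed=0):
    """Candidates are ranked by pairwise comparisons alone, so no metric
    score is ever modeled. `prefer(a, b)` returns a negative number if
    parameter set `a` is preferred, positive if `b` is. SortCMA itself
    adapts a full CMA-style covariance; this sketch only updates the mean
    and step size."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    for _ in range(iters):
        cands = [mean + sigma * rng.standard_normal(mean.shape)
                 for _ in range(pop)]
        ranked = sorted(cands, key=cmp_to_key(prefer))  # best first
        mean = np.mean(ranked[:elite], axis=0)          # recombine the elite
        sigma *= 0.9                                    # shrink step size
    return mean
```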

Authors

Leonid Keselman,Katherine Shih,Martial Hebert,Aaron Steinfeld

Published Date

2023/10/1

Deep Projective Rotation Estimation through Relative Supervision

Orientation estimation is at the core of a variety of vision and robotics tasks such as camera and object pose estimation. Deep learning has offered a way to develop image-based orientation estimators; however, such estimators often require training on a large labeled dataset, which can be time-intensive to collect. In this work, we explore whether self-supervised learning from unlabeled data can be used to alleviate this issue. Specifically, we assume access to estimates of the relative orientation between neighboring poses, such as can be obtained via a local alignment method. While self-supervised learning has been used successfully for translational object keypoints, in this work, we show that naively applying relative supervision to the rotational group will often fail to converge due to the non-convexity of the rotational space. To tackle this challenge, we propose a new algorithm for self-supervised orientation estimation which utilizes Modified Rodrigues Parameters to stereographically project the closed manifold of SO(3) to the open manifold of ℝ³, allowing the optimization to be done in an open Euclidean space. We empirically validate the benefits of the proposed algorithm for the rotation averaging problem in two settings: (1) direct optimization on rotation parameters, and (2) optimization of the parameters of a convolutional neural network that predicts object orientations from images. In both settings, we demonstrate that our proposed algorithm is able to converge to a consistent relative orientation frame much faster than algorithms that operate purely in the SO(3) space. Additional information can be found at https://sites.google.com/view …
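
For reference, the Modified Rodrigues Parameters projection mentioned above has a standard closed form; this sketch uses the textbook definition rather than code from the paper:

```python
import numpy as np

def quat_to_mrp(q: np.ndarray) -> np.ndarray:
    """Stereographically project a unit quaternion (w, x, y, z) to
    Modified Rodrigues Parameters p = v / (1 + w), mapping the closed
    rotation manifold into open R^3 (standard MRP definition)."""
    q = q / np.linalg.norm(q)
    if q[0] < 0:           # use the antipodal quaternion to stay away
        q = -q             # from the projection singularity at w = -1
    return q[1:] / (1.0 + q[0])

def mrp_to_quat(p: np.ndarray) -> np.ndarray:
    """Inverse map back to a unit quaternion."""
    s = p @ p
    w = (1.0 - s) / (1.0 + s)
    v = 2.0 * p / (1.0 + s)
    return np.concatenate(([w], v))
```

A quick algebraic check confirms the round trip: with s = (1 - w) / (1 + w), the inverse recovers w and v exactly, so the projection is invertible on the chosen hemisphere.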

Authors

Brian Okorn,Chuer Pan,Martial Hebert,David Held

Published Date

2023/3/6

Flexible techniques for differentiable rendering with 3d gaussians

Fast, reliable shape reconstruction is an essential ingredient in many computer vision applications. Neural Radiance Fields demonstrated that photorealistic novel view synthesis is within reach, but was gated by performance requirements for fast reconstruction of real scenes and objects. Several recent approaches have built on alternative shape representations, in particular, 3D Gaussians. We develop extensions to these renderers, such as integrating differentiable optical flow, exporting watertight meshes and rendering per-ray normals. Additionally, we show how two of the recent methods are interoperable with each other. These reconstructions are quick, robust, and easily performed on GPU or CPU. For code and visual examples, see https://leonidk.github.io/fmb-plus
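
As a rough sketch of what rendering with 3D Gaussians involves, the following composites isotropic Gaussians along a single ray; it is illustrative only and omits the paper's extensions (optical flow, per-ray normals, mesh export):

```python
import numpy as np

def render_ray(origin, direction, means, scales, opacities, colors):
    """Alpha-composite isotropic 3D Gaussians along one ray.
    direction is unit-norm; means (N, 3), scales/opacities (N,),
    colors (N, 3). Illustrative sketch, not the paper's renderer."""
    t = (means - origin) @ direction            # depth of each Gaussian peak
    closest = origin + t[:, None] * direction   # closest ray point per Gaussian
    d2 = np.sum((means - closest) ** 2, axis=1) # squared perpendicular distance
    alpha = opacities * np.exp(-0.5 * d2 / scales ** 2)

    order = np.argsort(t)                       # front-to-back compositing
    color, trans = np.zeros(3), 1.0
    for i in order:
        if t[i] <= 0:                           # skip Gaussians behind camera
            continue
        color += trans * alpha[i] * colors[i]
        trans *= 1.0 - alpha[i]
    return color
```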

Authors

Leonid Keselman,Martial Hebert

Journal

arXiv preprint arXiv:2308.14737

Published Date

2023/8/28

Multi-task View Synthesis with Neural Radiance Fields

Multi-task visual learning is a critical aspect of computer vision. Current research, however, predominantly concentrates on the multi-task dense prediction setting, which overlooks the intrinsic 3D world and its multi-view consistent structures, and lacks the capacity for versatile imagination. In response to these limitations, we present a novel problem setting--multi-task view synthesis (MTVS), which reinterprets multi-task prediction as a set of novel-view synthesis tasks for multiple scene properties, including RGB. To tackle the MTVS problem, we propose MuvieNeRF, a framework that incorporates both multi-task and cross-view knowledge to simultaneously synthesize multiple scene properties. MuvieNeRF integrates two key modules, the Cross-Task Attention (CTA) and Cross-View Attention (CVA) modules, enabling the efficient use of information across multiple views and tasks. Extensive evaluations on both synthetic and realistic benchmarks demonstrate that MuvieNeRF is capable of simultaneously synthesizing different scene properties with promising visual quality, even outperforming conventional discriminative models in various settings. Notably, we show that MuvieNeRF exhibits universal applicability across a range of NeRF backbones. Our code is available at https://github.com/zsh2000/MuvieNeRF.
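
Cross-attention of the general kind the CTA and CVA modules are described as using can be sketched generically; the sizes and wiring below are assumptions, not the MuvieNeRF architecture:

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Generic cross-attention: queries from one stream (e.g., one task
    or view) attend to features of the others. Illustrative sketch."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feats, context_feats):
        # query_feats: (B, Nq, dim); context_feats: (B, Nc, dim)
        out, _ = self.attn(query_feats, context_feats, context_feats)
        return self.norm(query_feats + out)     # residual + layer norm
```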

Authors

Shuhong Zheng,Zhipeng Bao,Martial Hebert,Yu-Xiong Wang

Published Date

2023

Object discovery from motion-guided tokens

Object discovery--separating objects from the background without manual labels--is a fundamental open challenge in computer vision. Previous methods struggle to go beyond clustering of low-level cues, whether handcrafted (e.g., color, texture) or learned (e.g., from auto-encoders). In this work, we augment the auto-encoder representation learning framework with two key components: motion-guidance and mid-level feature tokenization. Although both have been separately investigated, we introduce a new transformer decoder showing that their benefits can compound thanks to motion-guided vector quantization. We show that our architecture effectively leverages the synergy between motion and tokenization, improving upon the state of the art on both synthetic and real datasets. Our approach enables the emergence of interpretable object-specific mid-level features, demonstrating the benefits of motion-guidance (no labeling) and quantization (interpretability, memory efficiency).
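
The tokenization component rests on vector quantization, which can be sketched in a few lines; this is the generic nearest-codebook step, without the paper's motion guidance:

```python
import numpy as np

def vector_quantize(features, codebook):
    """Snap each mid-level feature to its closest codebook entry.
    features: (N, D), codebook: (K, D). Generic VQ step, shown only to
    illustrate the tokenization idea."""
    # Squared distances via ||f||^2 - 2 f.c + ||c||^2, all pairs at once.
    d2 = (np.sum(features ** 2, axis=1, keepdims=True)
          - 2.0 * features @ codebook.T
          + np.sum(codebook ** 2, axis=1))
    idx = np.argmin(d2, axis=1)    # token id per feature
    return codebook[idx], idx
```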

Authors

Zhipeng Bao,Pavel Tokmakov,Yu-Xiong Wang,Adrien Gaidon,Martial Hebert

Published Date

2023

Professor FAQs

What is Martial Hebert's h-index at Carnegie Mellon University?

Martial Hebert's h-index is 122 overall and 65 since 2020.

What are Martial Hebert's research interests?

Martial Hebert's research interests are Computer Vision and Robotics.

What is Martial Hebert's total number of citations?

Martial Hebert has 61,130 citations in total.

Who are the co-authors of Martial Hebert?

Martial Hebert's listed co-authors include Stefan Holzer.

Co-Authors

Stefan Holzer

H-index: 40

Technische Universität München

