Expanding robot perception
Associate Professor Luca Carlone is working to give robots a more human-like awareness of their environment.
Biodiversity researchers tested vision systems on how well they could retrieve relevant nature images. More advanced models performed well on simple queries but struggled with more research-specific prompts.
The “PRoC3S” method helps an LLM create a viable action plan by testing each step in a simulation. This strategy could eventually help in-home robots complete more ambiguous chore requests.
Researchers propose a simple fix to an existing technique that could help artists, designers, and engineers create better 3D models.
The method could help communities visualize and prepare for approaching storms.
MIT CSAIL researchers used AI-generated images to train a robot dog in parkour without real-world data. Their LucidSim system demonstrates generative AI's potential for creating robotics training data.
A new method can train a neural network to sort through corrupted data while anticipating next steps. It can make flexible plans for robots, generate high-quality video, and help AI agents navigate digital environments.
New dataset of “illusory” faces reveals differences between human and algorithmic face detection, links to animal face recognition, and a formula predicting where people most often perceive faces.
A new method called Clio enables robots to quickly map a scene and identify the items they need to complete a given set of tasks.
A new algorithm helps robots practice skills like sweeping and placing objects, potentially helping them improve at important tasks in houses, hospitals, and factories.
CSAIL researchers introduce a novel approach allowing robots to be trained in simulations of scanned home environments, paving the way for customized household automation accessible to anyone.
MAIA is a multimodal agent that can iteratively design experiments to better understand various components of AI systems.
This technique could lead to safer autonomous vehicles, more efficient AR/VR headsets, or faster warehouse robots.
LLMs trained primarily on text can generate complex visual concepts through code with self-correction. Researchers used these illustrations to train an image-free computer vision system to recognize real photos.
The method uses language-based inputs instead of costly visual data to direct a robot through a multistep navigation task.