Researchers use large language models to help robots navigate
The method uses language-based inputs instead of costly visual data to direct a robot through a multistep navigation task.
DenseAV, developed at MIT, learns to parse and understand the meaning of language just by watching videos of people talking, with potential applications in multimedia search, language learning, and robotics.
The technique characterizes a material’s electronic properties 85 times faster than conventional methods.
A new approach could streamline virtual training processes or aid clinicians in reviewing diagnostic videos.
“Alchemist” system adjusts the material attributes of specific objects within images, with potential uses in modifying video game models to fit different environments, fine-tuning VFX, and diversifying robotic training data.
Fifteen new faculty members join six of the school’s academic departments.
Associate Professor Jonathan Ragan-Kelley optimizes how computer graphics and images are processed for the hardware of today and tomorrow.
Three neurosymbolic methods help language models find better abstractions within natural language, then use those representations to execute complex tasks.
MIT Sea Grant students apply machine learning to support local aquaculture hatcheries.
Novel method makes tools like Stable Diffusion and DALL-E 3 faster by simplifying the image-generating process to a single step while maintaining or enhancing image quality.
FeatUp, developed by MIT CSAIL researchers, boosts the resolution of any deep network or visual foundation model for computer vision systems.
By enabling models to see the world more like humans do, the work could help improve driver safety and shed light on human behavior.
The team used machine learning to analyze satellite and roadside images of areas where small farms predominate and agricultural data are sparse.
The ambient light sensors responsible for smart devices’ brightness adjustments can be exploited by hackers to capture images of touch interactions like swiping and tapping.
PhD students interning with the MIT-IBM Watson AI Lab look to improve natural language usage.