Where is my robot butler?

  • Why is everyone using language models?
  • Why use learning based methods when control works?
  • What's with Foundation models for robotics?
  • Will GPT-4, 5, 6? be a robot? 😱
Household robots need to move beyond simple programmed tasks like those a Roomba performs and become full-fledged digital assistants.

Bring your own robot, let's add language to your research

A robotic agent that exists (physically) in the world gains access to rich and personalized knowledge of its environment. For example, it might be able to answer questions like: How much do things weigh? What's fragile? Or where you store the extra chocolates that you don't want anyone to find? Building an agent that can accomplish tasks requires integrating a diverse set of technologies and engineering: language models, SLAM, semantic mapping, task planning, understanding object affordances, and end-effector control. This course will both cover foundational works in grounding language to action and analyze (or reimplement) state-of-the-art Large Language Model based task planners. This area is fascinating and difficult because it is so cross-cutting. Readings and topics are pulled from Robotics (CoRL, ICRA, RSS, HRI, IROS), Computer Vision (CVPR, I/ECCV), Natural Language Processing (ACL, EMNLP), and Machine Learning (ICLR, ICML, NeurIPS).

Projects will be scoped by prior hardware/simulator experience -- but knowledge of Deep Learning + one specialty (NLP/CV/Robotics) is basically required. Send Qs to Yonatan (ybisk@cs).

Topics
  • LLMs & Foundation Models
  • Instruction following & Dialogue
  • Task and Motion Planning
  • End-Effector & real-valued control
  • Semantic Mapping (2D and 3D)
  • World Models
Questions
  • How do you define or evaluate Dialogue?
  • Limitations of offline and unimodal pretraining
  • How does embodiment shape meaning?
  • Discrete vs continuous spaces and representations.
  • When is Sim2Real possible? What about manipulation?
  • I only have one brain, do I need more than one model?
Logistics
  • Time & Place: 3:30pm - 4:50pm Tu/Th -- DH 1212
  • Canvas (probably)
  • Instructor: Yonatan Bisk
  • Zoom? Nope
  • Readings? The library is your friend
  • Slides? No ma'am

Course Schedule

Tues    Topic                      Thurs   Topic
Aug 26  Dualism                    Aug 28  Symbol Grounding
Sep 2   Before 1990 + Prize        Sep 4   Defining intelligence
Sep 9   Building Worlds            Sep 11  Linguistic Structure
Sep 16  Group Pitches              Sep 18  Perspective & Red team
Sep 23  Space and Time             Sep 25  Continuous
Sep 30  Mapping & Planning         Oct 2   Real valued + Sim2Real
Oct 7   States & Tasks             Oct 9   Mid-term presentations
Oct 14  Fall Break                 Oct 16  Fall Break
Oct 21  Manipulation               Oct 23  Imagination
Oct 28  Concept Learning           Oct 30  Pragmatics and Dialogue
Nov 4   Democracy Day              Nov 6   Theory of Mind
Nov 11  Coming soon                Nov 13  Coming soon
Nov 18  Project Hours (virtual)    Nov 20  Project Hours (virtual)
Nov 25  Philosophy                 Nov 27  Thanksgiving
Dec 2   Project Presentations      Dec 4   Philosophy Presentations

Course Structure and Grading

This course is available as both a seminar (6 credits) and a project-based (12 credits) course.
Assignment                                    Grades         Schedule
Definitions                                   3*10 (indiv)   Most weeks
Start             Pitches                     5 (group)      9/18
                  Perspective + Red team      5 (group)
Mid-semester      Presentation                5 (group)      10/9
                  Report                      10 (group)
                  Feedback                    5 (group)
Project Hours     Key Experiments             5 (group)      11/20
End of Semester   Presentation (Philosophy)   5 (group)      12/5
                  Presentation (Project)      10 (group)
                  Report                      20 (group)
Proposal: Both seminar and project based students will write a proposal. While project students will go on to work on implementation, the seminar students should also go through the mental exercise of planning out what a system needs, what dependencies components have, where gradients might flow, etc. They will then get to revise their understanding in their final report.

Groups: Both seminar and project based assignments will be done in groups. Groups will likely be capped at five people.

Equal Participation: All reports must include a breakdown of each teammate's contributions.

Project Pitch (5pts)
Midsemester Presentation (5pts)
Final Presentation (10pts)
Final Report (20pts)

Relevant Related Courses

Course Policies

Late Assignments

COVID Details:

If a student tests positive for COVID-19, they will be invited to attend discussion virtually and will be expected to participate as usual. This includes participation points for raising their hands with questions/answers and submission of lab notebooks. Note that students who attend class while exhibiting symptoms will be told to leave and join virtually for the protection of all others present.

Accommodations for Students with Disabilities:

If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with the instructors as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to contact them at access@andrew.cmu.edu.

FAQ

  1. Can we use our own platforms? Yes! What robots do you have? Also check out the AI Maker Space
  2. What about custom sensors and hardware? Same answer :)
  3. What about other simulators? Same answer :)
  4. What about Web Agents? If multimodal and multi-turn
  5. LTI Curriculum Categories? 12 Hour version can be counted for a Task and a Lab
  6. Do I /need/ simulator experience? No, but plan to spend some time getting the engineering set up
  7. Can I attend discussion without registering? It's best to register (6hrs) even if you've finished your classes, since I need to prioritize time, energy, and space on registered students. I'll try and update this once I have a room confirmed with the registrar and see how much space we have in the class.

Some papers