Humanoids

2015 DARPA Robotics Challenge

Selected Publication:

Joshua James, Yifan Weng, Stephen Hart, Patrick Beeson, and Robert Burridge. Prophetic Goal-Space Planning for Human-in-the-Loop Mobile Manipulation. In Proceedings of the IEEE RAS Humanoids Conference, Seoul, Korea, 2015.


Inverse Kinematics

Selected Publication:

Patrick Beeson and Barrett Ames. TRAC-IK: An Open-Source Library for Improved Solving of Generic Inverse Kinematics. In Proceedings of the IEEE RAS Humanoids Conference, Seoul, Korea, 2015.
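TRAC-IK improves on KDL's Newton-based inverse kinematics by running it concurrently with an SQP-based solver and returning whichever finds a solution first. Below is a minimal usage sketch, assuming the trac_ik_python bindings and a robot URDF available on the ROS parameter server; the link names are placeholders for the names in your robot's URDF:

    from trac_ik_python.trac_ik import IK

    # Build a solver for the chain from "base_link" to "tool0"
    # (placeholder link names; substitute your URDF's actual names).
    ik_solver = IK("base_link", "tool0", timeout=0.005, solve_type="Speed")

    # Seed the search from a neutral joint configuration.
    seed = [0.0] * ik_solver.number_of_joints

    # Solve for a target pose: position (x, y, z) plus orientation as a
    # quaternion (qx, qy, qz, qw), expressed in the base_link frame.
    # Returns a tuple of joint angles, or None if no solution is found
    # within the timeout.
    solution = ik_solver.get_ik(seed, 0.45, 0.10, 0.30, 0.0, 0.0, 0.0, 1.0)
    print(solution)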


Activity Recognition

Automated video processing techniques can be integrated into a comprehensive toolbox that drastically reduces the time needed to search and analyze videos. Such a toolbox would allow specific regions in a video stream to be isolated for monitoring, providing quick indexing for human review of all motion-based activity in the area of a vehicle. It would also allow the user to query for specific activities or events that occurred in a region; these could be detected automatically by software and presented directly to the user.

We are looking to design a system that detects and tracks humans, human activity, human-station interaction, and team interactions using existing cameras and videos. Our overall objectives can be achieved by developing a suite of algorithms that can handle several key sub-challenges: 1) Robustly handling unconstrained video content and capture conditions; 2) Extracting functional descriptions of complex human events; 3) Handling ad hoc event queries effectively; 4) Operating efficiently, so the system can keep up with the flood of videos being added to current databases and provide effective interactive search over such databases.
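As a concrete illustration of the region-monitoring idea, the sketch below uses off-the-shelf OpenCV background subtraction as a stand-in for the actual perception engine (the filename and region are placeholders): it flags motion inside a fixed region of a video and records the timestamps, which is already enough to build a quick index for human review:

    import cv2

    # Placeholder region of interest (x, y, width, height), e.g. the
    # area around a workstation or vehicle in the camera's view.
    ROI = (100, 50, 200, 150)

    cap = cv2.VideoCapture("surveillance.mp4")   # placeholder filename
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

    motion_times = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        x, y, w, h = ROI
        mask = subtractor.apply(frame[y:y + h, x:x + w])
        # Index this timestamp if enough foreground pixels changed.
        if cv2.countNonZero(mask) > 0.02 * w * h:
            motion_times.append(cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0)
    cap.release()

    print(f"{len(motion_times)} frames with motion in the monitored region")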

Selected Publication:

Patrick Beeson, Nicholas Barrash, and Brian Burns. Perception Engine for Activity Recognition and Logging using Manual Procedure Instructions. In Proceedings of the International Symposium on Artificial Intelligence, Robotics and Automation in Space (i-SAIRAS), Turin, Italy, September 2012.


Autonomous Vehicles

2007 DARPA Urban Challenge

Austin Robot Technology (ART) formed as a collection of technologists interested in participating in the 2005 Grand Challenge, a race through the desert. In their spare time and with much of their own money, they created an autonomous vehicle that made it to the semi-finals in 2005. The same vehicle that was designed and engineered as a part of the 2005 Grand Challenge provided the starting point for ART's entry into the Urban Challenge.

For the 2007 competition, Austin Robot Technology teamed up with The University of Texas at Austin (UT Austin) via Dr. Peter Stone's spring 2007 undergraduate course. This partnership provided the team with Artificial Intelligence expertise as well as a group of excited undergraduate programmers. It provided the university with an interesting platform on which to offer invaluable undergraduate research opportunities.

Austin Robot Technology's entry in the DARPA Urban Challenge had two main goals. First, the team aimed to create a fully autonomous vehicle capable of safely and robustly meeting all of the criteria laid out in DARPA's Technical Evaluation Criteria document. Second, the team aimed to educate and train members of the next generation of computer science and robotics researchers by encouraging and facilitating extensive participation by undergraduate programmers.

Declining enrollments in computer science undergraduate programs across the country are a potential crisis for the nation's future IT industry, as well as for government research agencies such as DARPA. Inspiring challenge problems may be essential drivers toward reversing this enrollment trend. As such, we believe strongly that the long-term impact of competitions such as the Urban Challenge will be as much in education and training as in technological innovation.

Creating an autonomous vehicle with an undergraduate team biased our algorithms to be as straightforward as possible. Nonetheless, these algorithms were reliable enough for our team to place among the top twenty-one teams at the Urban Challenge National Qualifying Event (NQE).

We are continuing this instructional stream, focusing on incremental improvements to the existing system. The Freshman Research Initiative (FRI) expects this undergraduate course to continue for at least several more years, as the University of Texas at Austin continues its push to get undergraduates involved in real-world research.

Selected Publication:

Patrick Beeson, Jack O'Quin, Bartley Gillan, Tarun Nimmagadda, Mickey Ristroph, David Li, and Peter Stone. Multiagent interactions in urban driving. Journal of Physical Agents, 2(1):15-30, March 2008. Special issue on Multi-Robot Systems.


Intelligent Wheelchairs

Smart wheelchairs that can detect and avoid obstacles have been developed with the goal of serving as mobility aids for persons with disabilities who find standard power wheelchairs difficult or impossible to use. By increasing mobility and autonomy, they have the potential to improve the health of populations ranging from the severely disabled to the growing number of aging people.

We propose to create and evaluate an Intelligent Wheelchair that represents a qualitative increase in capability, based on state-of-the-art methods for robot exploration, map-building, navigation, and direction following. The Intelligent Wheelchair acts under the direction of its human driver, but it is also an intelligent robot, sensing its local surroundings and maintaining a "cognitive map" of its environment. The proposed research is driven by the structure and requirements of the human-robot interface by which the human driver instructs the Intelligent Wheelchair where it should go.

Selected Publication:

Patrick Beeson, Matt MacMahon, Joseph Modayil, Aniket Murarka, Benjamin Kuipers, and Brian Stankiewicz. Integrating multiple representations of spatial knowledge for mapping, navigation, and communication. In Proceedings of the Symposium on Interaction Challenges for Intelligent Assistants, AAAI Spring Symposium Series, Stanford, CA, 2007. AAAI Technical Report SS-07-04.


Foundational Learning

Common sense, and hence most other human knowledge, is built on knowledge of a few foundational domains, such as space, time, action, objects, and causality. We are investigating how this knowledge can be learned from unsupervised sensorimotor experience. We assume that an agent, human or robot, starts with a low-level ontology for describing its sensorimotor interaction with the world. We call this the "pixel level"; William James called it the "blooming, buzzing confusion". The learning task is to create useful higher-level representations for space, time, actions, objects, etc., to support effective planning and action in the world.

The basic idea behind foundational learning is to compose multiple machine learning methods, using weak but general unsupervised or delayed-reinforcement learning methods to create the prerequisites for applying stronger but more specific learning methods such as abductive inference or supervised learning. An important common theme of all this work is the learning of a higher level ontology of places, objects, and their relationships, based on the low-level "pixel ontology" of direct experience. These learning methods create new symbols and categories, solving the symbol grounding problem for these symbols, and defining the symbols in terms of the agent's own experience, not the experience of an external teacher or programmer.
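The sketch below illustrates this compose-learners pattern on synthetic data (scikit-learn stand-ins; this is not the algorithm from the paper cited below): a weak unsupervised clusterer invents place symbols from raw sensory vectors, and those self-generated labels then supervise a stronger classifier for place recognition:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neighbors import KNeighborsClassifier

    # Synthetic stand-in for raw sensory snapshots ("pixel level"):
    # 300 range-sensor readings taken near three distinctive states.
    rng = np.random.default_rng(0)
    snapshots = np.vstack([rng.normal(c, 0.3, size=(100, 8))
                           for c in (0.0, 2.0, 4.0)])

    # Step 1 (weak, unsupervised): cluster the snapshots.  Each cluster
    # becomes a new symbol -- a hypothesized "place" -- grounded in the
    # agent's own experience rather than in labels from a teacher.
    places = KMeans(n_clusters=3, n_init=10,
                    random_state=0).fit_predict(snapshots)

    # Step 2 (stronger, supervised): train a classifier on the invented
    # labels so future snapshots can be recognized as known places.
    recognizer = KNeighborsClassifier(n_neighbors=5).fit(snapshots, places)

    new_snapshot = rng.normal(2.0, 0.3, size=(1, 8))
    print("recognized place:", recognizer.predict(new_snapshot)[0])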

Selected Publication:

Benjamin Kuipers and Patrick Beeson. Bootstrap learning for place recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 174-180, Edmonton, Canada, 2002.


Gesture Recognition

Gesture recognition is an important communication modality for a variety of human-robot applications, including mobile robotics and ambient intelligence domains. Most gesture recognition systems focus on estimating the position of the arm with respect to the torso of a tracked human. There is an alternative approach to gesture recognition that focuses on reliable detection of time-dependent, cyclic "beats" given by a human user. While the expressiveness of "beat-based" gestures is limited, beat-based gesture recognition has several benefits, including reliable 2D gesture detection at far ranges, gesture detection anywhere in the image frame, detection when the human is mostly hidden or obscured, and secure detection via randomly rotated beat patterns that are known only by the user and the perception system.
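A minimal sketch of the core periodicity test (our illustration under simplified assumptions, not the detector from the paper below): reduce the video to a scalar motion signal, then declare a beat gesture when the dominant frequency of that signal over a sliding window matches the expected beat rate:

    import numpy as np

    def detect_beat(signal, fps, expected_hz, tol=0.25):
        """Return True if `signal` (e.g., per-frame motion energy in an
        image region) oscillates near the expected beat frequency."""
        signal = np.asarray(signal, dtype=float)
        signal -= signal.mean()                 # remove the DC component
        spectrum = np.abs(np.fft.rfft(signal))
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
        dominant = freqs[np.argmax(spectrum)]
        return abs(dominant - expected_hz) < tol

    # Example: a user waving at ~2 Hz, sampled by a 30 fps camera
    # over a 2-second window.
    fps = 30.0
    t = np.arange(60) / fps
    waving = np.sin(2 * np.pi * 2.0 * t) + 0.2 * np.random.randn(60)
    print(detect_beat(waving, fps, expected_hz=2.0))   # True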

Selected Publication:

Graylin Trevor Jay, Patrick Beeson, and Odeste Chadwicke Jenkins. Beat-based gesture recognition for non-secure, far-range, or obscured perception scenarios. In IJCAI Workshop on Space, Time and Ambient Intelligence (STAMI), Barcelona, Spain, July 2011.


Hybrid Spatial Semantic Hierarchy (HSSH)

The Spatial Semantic Hierarchy (SSH) provides a hierarchy of abstractions for reasoning about large-scale space. It assumes that the large-scale environments in which the agent operates have a certain structure, specifically that the environments can be described as collections of places connected by paths. The Hybrid Spatial Semantic Hierarchy (HSSH) is a more specific framework of spatial knowledge that differs from the SSH in several key ways---most importantly, it incorporates knowledge of small-scale space.

The HSSH allows a robot to describe the world using qualitatively different representations, each with its own ontology. The hierarchy of connected representations is useful for the many tasks of navigation: safe motion, localization, map-building, and route planning. Equally important, since the multiple representations are motivated by human cognitive abilities, they provide a "natural" way for a robot to interact with a human.

To solve the mapping problem, an autonomous robot explores an unknown environment and uses its own observations to build a useful map. Maps have a variety of different uses, including route planning, local motion control with hazard avoidance, estimating distances and directions, localization, and place recognition. Although important progress has been made on the SLAM (simultaneous localization and mapping) problem within a single global frame of reference, metrical uncertainty can still accumulate over time, making it difficult to close large loops with confidence.

In recent work to address this problem, we have revised the basic Spatial Semantic Hierarchy (SSH) into the Hybrid SSH by defining a clean interface to the local perceptual map. We exploit the strengths of three different map representations to factor the mapping problem into three distinct sub-problems that can be solved reliably: local metrical mapping in the scrolling local perceptual map; global topological mapping, given the local metrical maps, which generates a tree of all possible maps to resolve structural ambiguities; and global metrical mapping, given the local metrical and global topological maps.

By factoring the problem in this way, we can build an accurate global metrical map on the skeleton provided by the accurate global topological map. The factored problem also leads to a robust and useful map: the local metrical map is useful for place recognition and local motion control with hazard avoidance; the global topological map is useful for localization and route planning; and the global metrical map is useful for estimating distances and directions.
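The factoring can be summarized as three data structures, each estimated given the one before it: local metrical maps feed topological mapping, and the topological map anchors the global metrical layout. The sketch below is schematic (the names and fields are ours, not the HSSH implementation):

    from dataclasses import dataclass, field

    @dataclass
    class LocalPerceptualMap:
        """Scrolling metrical map of small-scale space around the robot;
        supports place recognition, local motion control, and hazard
        avoidance."""
        resolution: float = 0.05                 # meters per grid cell
        occupancy: list = field(default_factory=list)

    @dataclass
    class TopologicalMap:
        """Large-scale space as places connected by paths; supports
        localization and route planning.  During exploration, a tree of
        candidate maps is kept until structural ambiguity is resolved."""
        places: set = field(default_factory=set)
        paths: dict = field(default_factory=dict)  # (place, place) -> path

    @dataclass
    class GlobalMetricalMap:
        """Single global frame laid out on the topological skeleton;
        supports estimates of distances and directions."""
        place_poses: dict = field(default_factory=dict)  # place -> (x, y, yaw)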

Selected Publication:

Patrick Beeson, Joseph Modayil, and Benjamin Kuipers. Factoring the mapping problem: Mobile robot map-building in the Hybrid Spatial Semantic Hierarchy. International Journal of Robotics Research, 29(4):428-459, April 2010.
