Machine Learning Simulation: Torso Dynamics of Robotic Biped
dc.contributor.author | Renner, Michael Robert | en |
dc.contributor.committeechair | Granata, Kevin P. | en |
dc.contributor.committeemember | Hong, Dennis W. | en |
dc.contributor.committeemember | Kasarda, Mary E. | en |
dc.contributor.committeemember | Reinholtz, Charles F. | en |
dc.contributor.committeemember | Sandu, Corina | en |
dc.contributor.department | Mechanical Engineering | en |
dc.date.accessioned | 2014-03-14T20:43:40Z | en |
dc.date.adate | 2007-08-22 | en |
dc.date.available | 2014-03-14T20:43:40Z | en |
dc.date.issued | 2007-08-10 | en |
dc.date.rdate | 2007-08-22 | en |
dc.date.sdate | 2007-08-17 | en |
dc.description.abstract | Military, medical, exploratory, and commercial robots have much to gain from exchanging wheels for legs. However, the equations of motion of dynamic bipedal walker models are highly coupled and non-linear, making the selection of an appropriate control scheme difficult. Q-learning, a temporal-difference reinforcement learning method, develops complex control policies through environmental exploration and exploitation. As a proof of concept, Q-learning was applied in simulation to a benchmark single-pendulum swing-up/balance task; the value function was approximated first with a look-up table and then with an artificial neural network. We then applied Evolutionary Function Approximation for Reinforcement Learning to control the swing leg and torso of a 3-degree-of-freedom active dynamic bipedal walker in simulation. The model began each episode in a stationary vertical configuration. At each time step, the learning agent was rewarded for horizontal hip displacement scaled by torso altitude, which promoted faster walking while maintaining an upright posture, and one of six coupled torque activations was applied through two first-order filters. Over the course of 23 generations, an approximation of the value function was evolved that enabled walking at an average speed of 0.36 m/s. The agent oscillated the torso forward then backward at each step, driving the walker forward for forty-two steps in thirty seconds without falling over. This work represents the foundation for improvements in anthropomorphic bipedal robots, exoskeleton mechanisms to assist in walking, and smart prosthetics. | en |
dc.description.degree | Master of Science | en |
dc.identifier.other | etd-08172007-151826 | en |
dc.identifier.sourceurl | http://scholar.lib.vt.edu/theses/available/etd-08172007-151826/ | en |
dc.identifier.uri | http://hdl.handle.net/10919/34602 | en |
dc.publisher | Virginia Tech | en |
dc.relation.haspart | etd_part3.pdf | en |
dc.relation.haspart | etd_part2.pdf | en |
dc.relation.haspart | etd_part1.pdf | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Dynamic Bipedal Walking | en |
dc.subject | Reinforcement Learning | en |
dc.subject | Q-Learning | en |
dc.subject | Torso | en |
dc.subject | NEAT+Q | en |
dc.title | Machine Learning Simulation: Torso Dynamics of Robotic Biped | en |
dc.type | Thesis | en |
thesis.degree.discipline | Mechanical Engineering | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |
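
Illustrative note on the abstract: the look-up-table form of Q-learning mentioned above can be sketched as follows. This is a minimal sketch, not the thesis's implementation. The function names, the learning rate alpha, the discount gamma, the exploration rate epsilon, and the discretized state indices are all assumptions made for illustration, and step_reward is only one hypothetical reading of "horizontal hip displacement scaled by torso altitude" (a simple product).

```python
import random


def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.95):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Unvisited (state, action) pairs are treated as 0.0.
    alpha and gamma are illustrative defaults, not values from the thesis.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def epsilon_greedy(q, state, n_actions, epsilon=0.1, rng=random):
    """Explore with probability epsilon, otherwise exploit the table."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q.get((state, a), 0.0))


def step_reward(hip_dx, torso_altitude):
    """Hypothetical reward: hip displacement scaled by torso altitude.

    The abstract does not give the exact formula; a product is assumed here.
    """
    return hip_dx * torso_altitude
```

With six discrete actions, as in the six coupled torque activations described in the abstract, an agent would call epsilon_greedy(q, state, 6) to choose an activation and q_update(...) after each time step. The walker itself used an evolved neural-network approximation of the value function rather than a table, so this sketch corresponds only to the pendulum proof of concept.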