Sunday, February 17, 2008

Project 2

by Luis Daniel Ballesteros

Goal:
The goal of this project was to develop an algorithm for the RS Media to learn to perform some action. After surveying the resources available and considering the awkward positions and movements of the robot's arms and body, I decided that throwing a ball at a target was the best way to go. Thus the goal of this project was to see effective learning in the context of changing behavior until the ball hit a specific target.

Algorithm:
The algorithm I used for this was a mix of simple and complex. The feedback the robot received from me about its performance was a simple too-far or not-far-enough signal. Thus the value function was pretty simple, though its internal representation changed depending on the context.
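In code terms (the encoding here is my own, not anything from the robot's library), that feedback boils down to a signed direction:

    # The entire feedback channel: two values, nothing more.
    TOO_SHORT = +1   # "not far enough": push the throw farther
    TOO_FAR   = -1   # "too far": pull the throw back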

To perform better, the robot had a set of actions it could alter. The base movement for the throw was a combination of swinging the waist, lifting the arm, and twisting the wrist. The variables that could be changed were therefore the speed and distance of the waist turn, the speed and height of the arm (both initially and during the throw), and finally the timing of the wrist twist. Together these variables created a complex action space for the robot to explore.
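Concretely, the tunable variables could be represented something like this; the names, starting values, and unit size are my own assumptions for illustration:

    # Hypothetical encoding of the five tunable throw variables.
    PARAMS = {
        "waist_speed":    0.5,   # how fast the waist swings
        "waist_distance": 0.5,   # how far the waist turns
        "arm_speed":      0.5,   # how fast the arm lifts
        "arm_height":     0.5,   # how high the arm rises during the throw
        "wrist_time":     0.5,   # when, during the swing, the wrist twists
    }
    UNIT = 0.1   # one "unit" of adjustment per parameter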

Upon an initial failure, the robot would, whether too far or too close, adjust one unit in each of the ways it could change, making the throw one unit closer or farther on each trial. If, after a full-unit change in one direction, it received a signal to change in the other direction, it would not make a full unit's change back. The reason is that it might first have thrown not far enough, then, after adjusting, thrown too far; going all the way back to where it started would leave it oscillating forever between too far and not far enough. So in this case it would adjust only one variable, whichever it saw fit, in the other direction. Consecutive feedback in that same direction would toggle which variables and combinations were changed. For instance, if it threw too far, maybe it wouldn't lift its arm as high next time; if that was still too far, it would reset the arm variable and instead twist the wrist sooner. Eventually it would try a combination of a lower arm and an earlier wrist, and so on, until it reached one full unit in the backwards direction, at which point it would go back to making full-unit changes. This way, if you move the target, it can make full-unit changes to find it again.
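Here is a rough, self-contained sketch of that coarse-to-fine search. The parameter names, step sizes, and the order in which combinations of variables are tried are all my own assumptions for illustration:

    from itertools import chain, combinations

    PARAM_NAMES = ["waist_speed", "waist_distance", "arm_speed",
                   "arm_height", "wrist_time"]
    UNIT = 0.1                       # one full unit of adjustment
    FINE = UNIT / len(PARAM_NAMES)   # size of one small fine-mode step

    def all_combos(names):
        # Non-empty combinations: single variables first, then pairs, etc.
        return list(chain.from_iterable(
            combinations(names, k) for k in range(1, len(names) + 1)))

    class ThrowLearner:
        def __init__(self):
            self.params = {name: 0.5 for name in PARAM_NAMES}
            self.combos = all_combos(PARAM_NAMES)
            self.coarse = True       # full-unit mode until the feedback flips
            self.last_dir = None     # direction of the previous adjustment
            self.step = 0            # which fine combination is being tried
            self.applied = None      # last fine change, so it can be undone

        def update(self, direction):
            # direction: +1 for "not far enough", -1 for "too far".
            if self.last_dir is not None and direction != self.last_dir:
                # The feedback flipped; a full-unit reversal would oscillate
                # forever, so switch to small single-combination changes.
                self.coarse, self.step, self.applied = False, 0, None
            if self.coarse:
                for name in PARAM_NAMES:           # move every variable a unit
                    self.params[name] += direction * UNIT
            else:
                if self.applied is not None:       # same verdict again: undo
                    combo, d = self.applied        # the previous attempt first
                    for name in combo:
                        self.params[name] -= d * FINE
                combo = self.combos[self.step % len(self.combos)]
                for name in combo:                 # then try the next one
                    self.params[name] += direction * FINE
                self.applied = (combo, direction)
                self.step += 1
                if self.step >= len(self.combos):
                    # Everything up to a full unit backwards has been tried;
                    # the target may have moved, so resume full-unit changes.
                    self.coarse, self.applied = True, None
            self.last_dir = direction

Feeding it +1, +1, -1, for example, walks the throw two full units out, then backs off with a single small change rather than a full reversal.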

Thus this algorithm was just a complex game of hot and cold. The robot received feedback through its left and right feet for too close or too far, and you could hit one of the back triggers on the feet to keep the movement the same or to quit the demonstration. However, learning is not saved across sessions, so turning the robot off or quitting requires relearning everything from scratch on the next startup.
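The demonstration loop itself looked roughly like this; perform_throw and read_foot_sensor below are console stand-ins I made up, not real RS Media calls, and ThrowLearner is the sketch above:

    def perform_throw(params):
        # Stand-in for the real throw macro (waist swing, arm lift, wrist flick).
        print("throwing with", params)

    def read_foot_sensor():
        # Stand-in: the real robot reads its foot pads and back triggers.
        return input("feedback [left/right/keep/quit]: ")

    def run_session(learner):
        while True:
            perform_throw(learner.params)
            signal = read_foot_sensor()
            if signal == "left":          # too far (my guess at the mapping)
                learner.update(-1)
            elif signal == "right":       # not far enough
                learner.update(+1)
            elif signal == "quit":        # back trigger ends the demonstration
                break                     # nothing is saved across sessions
            # "keep" leaves the parameters alone and just throws again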

Results:
I was surprised by how effective this algorithm actually turned out to be, though it did take a few throws to get it just right. I enjoyed the fact that I could move the cup elsewhere and the robot would be able to find it again.

The only disappointment I had with the results was that the robot couldn't throw very far, so this was more an exercise in precisely dropping the ball with a bit of speed than in really throwing it. Other than that, this was a success.

Obstacles:
A very sad part of this whole project is that the entire throwing mechanic is hacked together from what little control is available. For instance, the library has no functions for direct control of the hand. All the functions that do something with the hand trigger long sequences of movements, like reaching out and requesting an object, or defiantly making a motion to drop the object. There were no functions that could be invoked during other movements, since they all initiated their own complete macros for the whole robot. That being the case, I could not have the robot actually grab the ball, but luckily the ball fit on top of his hand, in the groove between the thumb and forefinger.

At that point, all I had to do was get the palm in the right orientation so that the ball could sit in this groove. Unfortunately, again, there are no functions that directly control movement of the wrist; I found only two. One twists the wrist all the way up, the other twists it all the way down. So to prepare the ball in his hand, I had him twist the wrist all the way up, then physically moved it to a level position and placed the ball on top of his hand. When he threw, I simply had him spin the wrist back down and flick the ball off the top while it was moving. Combined with the waist and arm movement, this made somewhat of a throw...
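For what it's worth, the sequence amounted to something like this; every robot-control name here is a made-up wrapper, since the real library only exposes complete macros:

    import time

    class FakeRobot:
        # Console stand-in; the real calls were canned whole-body macros.
        def __getattr__(self, name):
            return lambda **kw: print(name, kw or "")

    def prepare_ball(robot):
        robot.wrist_up()    # the only wrist control: twist fully up, then I
                            # leveled the wrist by hand and rested the ball in
                            # the groove between thumb and forefinger

    def throw(robot, params):
        robot.swing_waist(speed=params["waist_speed"],
                          distance=params["waist_distance"])
        robot.lift_arm(speed=params["arm_speed"],
                       height=params["arm_height"])
        time.sleep(params["wrist_time"])  # wait for the release point
        robot.wrist_down()  # spin fully down mid-swing, flicking the ball
                            # off the top of the hand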

In the end, though, needing to physically level the hand muddied the data and the learning a bit, since this step was not guaranteed to be the same each time; it added a level of randomness and uncertainty to the learning. Some may see this as a good thing, though...

Experience:
Overall, I once again had fun getting the robot to do something interesting. I enjoyed showing my friends what he could do, and at times he was even better at the game than some of them... I think both the robot and I learned something from this project.

Todo:
It would have been nice to have some mechanism for gauging by how much he missed and in what direction, rather than just a too-far or not-far-enough signal. That would have given the robot a better value function and let it learn more efficiently.
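For example, with a signed distance error each update could scale with the size of the miss; the gain values here are invented purely for illustration:

    # error_cm > 0 means the throw fell short, error_cm < 0 means it overshot.
    GAINS = {"waist_speed": 0.002, "waist_distance": 0.002,
             "arm_speed": 0.002, "arm_height": 0.002, "wrist_time": 0.001}

    def graded_update(params, error_cm):
        for name, gain in GAINS.items():
            params[name] += gain * error_cm   # bigger miss, bigger correction
        return params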

Links:
Here is a video of the robot in action. He is talented!
