by Luis Daniel Ballesteros
Goal:
The goal of this project was to develop an algorithm for the RS Media to learn to perform some action. After analyzing the resources available for doing this, and given the awkward positions and movements of the robot's arms and body, I decided that throwing a ball at a target was the best way to go. Thus the goal of this project was to see effective learning in the context of changing behavior so that a ball hits a specific target.
Algorithm:
The algorithm I used for this was a mix of simple and complex. The feedback that the robot received from me regarding its performance was a simple "too far" or "not far enough" signal. Thus the value analysis function was pretty simple, but changed representation internally depending on the context.
To alter its behavior to perform better, the robot had a set of actions that it could alter. The base movement for the throw was a combination of swinging the waist, lifting the arm, and twisting the wrist. Thus the variables that could be changed in this system were the speed and distance of the waist turn, the speed and height of the arm, both initially and during the throw, and finally when to twist the wrist. The combination of these variables created a complex action space that the robot could explore.
Upon initial failure, if the throw was too far or too close, the robot would adjust every variable by one unit, making the throw one full unit closer or farther on each trial. If, after a full unit change in one direction, it received a signal to change in the other direction, it would not make a full unit's change back. This is because it might first have been not far enough, and then, after analysis and change, too far. Going all the way back to where it was before the last change would leave it alternating forever between too far and not far enough. Thus, in this case, it would adjust only one variable of its choosing in the other direction. Consecutive feedback in this same direction would toggle which variables and combinations were changed. So for instance, if it went too far, maybe it wouldn't lift its arm as high next time, and if that was still too far, it would reset the arm variable and twist the wrist sooner. It would eventually try a combination of lower arm and earlier wrist, and so on until it reached one full unit in the backwards direction, at which point it would reset the incremental changes back to full unit changes. This way, if you move the target, it can make full unit changes to find it again.
Thus this algorithm was just a complex game of hot and cold. The robot received feedback through its left and right feet for too close or too far, and you could hit one of the back triggers on the feet to keep the movement the same, or to quit the demonstration. However, learning is not saved across sessions, so turning off the robot or quitting requires relearning everything all over again when it starts up.
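The coarse and fine stepping described above can be sketched roughly as follows. This is a simplified illustration, not the project's actual code: the class, its method names, and the idea of walking through variable combinations with a bit mask are all assumptions made for the example, and each variable is reduced to a plain integer setting. The post does not say what happens when the feedback flips again in the middle of fine-tuning, so the sketch picks one possible behavior and marks it as an assumption.

```java
// Simplified sketch of the coarse/fine "hot and cold" search.
// All names here are hypothetical; none come from the RS Media library.
class HotColdThrower {
    static final int FARTHER = +1; // feedback: the throw was not far enough
    static final int CLOSER = -1;  // feedback: the throw was too far

    private final int[] base;        // settings at the last full-unit position
    private int coarseDirection = 0; // direction of the last full-unit step
    private int fineDirection = 0;   // nonzero while fine-tuning after an overshoot
    private int mask = 0;            // bit i set = variable i is nudged by one unit

    HotColdThrower(int variableCount) {
        base = new int[variableCount];
    }

    /** Settings to use for the next throw. */
    int[] current() {
        int[] settings = new int[base.length];
        for (int i = 0; i < base.length; i++) {
            settings[i] = base[i] + ((mask >> i) & 1) * fineDirection;
        }
        return settings;
    }

    /** Updates the settings from one FARTHER or CLOSER signal. */
    void feedback(int direction) {
        if (fineDirection == 0) {
            if (coarseDirection == 0 || direction == coarseDirection) {
                // Still hunting in one direction: move every variable a full unit.
                shiftAll(direction);
                coarseDirection = direction;
                return;
            }
            // Overshoot: come back with a single variable instead of a full unit.
            fineDirection = direction;
            mask = 1;
        } else if (direction == fineDirection) {
            // Same signal again: try the next combination of variables.
            mask++;
        } else {
            // ASSUMPTION (not covered in the post): keep the current partial
            // settings and start fine-tuning the other way. This can bounce
            // between two neighbouring settings if the target lies between them.
            int[] kept = current();
            System.arraycopy(kept, 0, base, 0, base.length);
            fineDirection = direction;
            mask = 1;
        }
        int full = (1 << base.length) - 1;
        if (mask >= full) {
            // Every variable is nudged: that is one full unit, so go back
            // to full-unit steps from here.
            shiftAll(fineDirection);
            coarseDirection = fineDirection;
            fineDirection = 0;
            mask = 0;
        }
    }

    private void shiftAll(int direction) {
        for (int i = 0; i < base.length; i++) {
            base[i] += direction;
        }
    }
}
```

With three variables, two FARTHER signals followed by CLOSER signals step through one variable, then a different one, then both together, which mirrors the "lower arm, then earlier wrist, then both" example in the text.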
Results:
I was surprised by the results of this algorithm in the fact that it was actually quite effective, though it did take a few throws to get it just right. I enjoyed the fact that I could move the cup elsewhere and the robot would be able to find it again.
The only disappointment I had with the results was that the robot couldn't throw very far, so this was more an exercise in precisely dropping the ball with a bit of speed than it really was in throwing it. Other than that, this was a success.
Obstacles:
A very sad part of this whole project is that the entire throwing mechanic is very hacked together from what little control is available. For instance, the library has no functions for direct control of the hand. All functions that do something with the hand trigger long sequences and movements, like reaching out and requesting an object, or defiantly making a motion to drop the object. There were no functions that could be invoked during other movements since they all initiated their own complete macros for the whole robot. That being the case, I could not have the robot actually grab the ball, but luckily enough, the ball fit on top of his hand, in between the thumb and forefinger.
At this point, all I had to do was get the palm in the right orientation so that the ball could fit in this groove. Unfortunately, again, there are no functions which directly control fine movement of the wrist... I found only 2. One twists the wrist all the way up, the other twists it all the way down. Thus to prepare the ball in his hand, I had to have him twist it all the way up, and then physically move his wrist to a level position and place the ball on top of his hand. Then, when he threw it, I simply had him spin his wrist back down and flick the ball off of the top while it was moving. Combined with waist and arm movement, this made somewhat of a throw...
In the end though, needing to physically level the hand smudged the data and the learning a bit since this step was not guaranteed to be the same, thus it added a level of randomness and uncertainty to the learning. Some may see this as a good thing, though...
Experience:
Overall, I once again had fun having the robot do something interesting. I enjoyed showing my friends what he could do, and he was even better at the game than some other people at times... I think both the robot and I learned something from this project.
Todo:
It would have been nice to have some mechanic for gauging by how much he missed and in what direction, rather than just a too far or not far enough gauge. This would have allowed the robot to have a better value function in the algorithm and thus be able to learn more efficiently.
Links:
Here is a video of the robot in action. He is talented!
Sunday, February 17, 2008
Sunday, February 3, 2008
Humanoids Project 1
by Luis Daniel Ballesteros
Goal:
The initial idea behind this project is to make the robot dance. Of course, simply executing a series of moves is boring. I wanted the robot to learn to dance, so I devised a simple probabilistic learning algorithm. The goal of my project, then, is to see results indicative of learning in the robot's dances.
Algorithm:
The algorithm is simple, yet complicated. As units of movement, I copied all of the preset move functions from the sample dances and numbered them move0 through move120. I then wrote about 30 of my own moves, combinations of movements that seemed interesting to add to the pool, for a total of 151 possible moves. Now all the robot needed was a way of invoking the moves that are best fit for the song playing.
To choose moves, I set up a model where good moves had a higher probability of coming up than bad moves. I did this by giving each move a score from 0 to 99. To choose a move, I take 100 samples numbered 0 to 99. For each sample, the program randomly picks a move whose score is at least as high as the sample number and gives that move a point. Thus a move with a score of 30 has fewer potential points than one with a score of 80. At the end of 100 samples, the move with the most points is executed.
At the end of the dance the robot takes input from the user to judge how good or bad the dance was. If the dance was good, all moves performed receive a score bonus. If it was bad, all moves performed have their scores reduced. The user can also choose to vote neutrally and not change the scores.
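The sampling scheme above can be sketched like this. It is a minimal illustration and not the linked source code: the class and method names are made up for the example, the `used` array stands in for the rule that a move is not repeated within a dance, and breaking ties by lowest index is an assumption, since the post does not say how ties are resolved.

```java
import java.util.Random;

// Hypothetical sketch of the 100-sample move selection; names are illustrative.
class MoveChooser {
    /**
     * Returns the index of the move to perform next, or -1 if every move
     * has already been used in this dance.
     */
    static int chooseMove(int[] scores, boolean[] used, Random rng) {
        int[] points = new int[scores.length];
        for (int sample = 0; sample < 100; sample++) {
            // Count the moves whose score is at least the sample number.
            int eligible = 0;
            for (int m = 0; m < scores.length; m++) {
                if (!used[m] && scores[m] >= sample) {
                    eligible++;
                }
            }
            if (eligible == 0) {
                continue; // no move scores this high
            }
            // Give one point to a uniformly chosen eligible move.
            int pick = rng.nextInt(eligible);
            for (int m = 0; m < scores.length; m++) {
                if (!used[m] && scores[m] >= sample) {
                    if (pick == 0) {
                        points[m]++;
                        break;
                    }
                    pick--;
                }
            }
        }
        // The move with the most points wins (assumption: lowest index on ties).
        int best = -1;
        for (int m = 0; m < scores.length; m++) {
            if (!used[m] && (best < 0 || points[m] > points[best])) {
                best = m;
            }
        }
        return best;
    }
}
```

A move with score 80 is eligible for samples 0 through 80, while a move with score 30 is only eligible for samples 0 through 30, which is where the bias toward well-liked moves comes from.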
Also note that no move ever has 0 probability since not only do I not allow repeat moves in a dance, but even moves with score 0 are considered in the draw. This is good to revive moves that may have seemed bad in context, but are actually not that bad.
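The end-of-dance scoring could look something like the sketch below. The size of the bonus or penalty is not stated in the post, so `step` is a hypothetical parameter, as are the class and method names. Scores are clamped to the 0 to 99 range described above.

```java
// Hypothetical sketch of the end-of-dance score update; names are illustrative.
class DanceJudge {
    static final int MIN_SCORE = 0;
    static final int MAX_SCORE = 99;

    /**
     * Applies one vote to every move performed in the dance.
     * vote is +1 (liked), -1 (disliked), or 0 (neutral, no change);
     * step is an assumed bonus/penalty size, not taken from the post.
     */
    static void applyVote(int[] scores, int[] performed, int vote, int step) {
        for (int i = 0; i < performed.length; i++) {
            int move = performed[i];
            int updated = scores[move] + vote * step;
            if (updated < MIN_SCORE) {
                updated = MIN_SCORE;
            }
            if (updated > MAX_SCORE) {
                updated = MAX_SCORE;
            }
            scores[move] = updated;
        }
    }
}
```

Because a score never drops below 0 and sample 0 accepts every move, even a heavily penalized move keeps a small chance of being picked again.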
Results:
I was personally surprised with the results of the learning. In the beginning, every move had a score of 50, so they all had an even chance to be chosen. However, after a few runs, it became clear that the moves I liked were happening more frequently and the ones I didn't were not. For example, some of the moves were simply sleep instructions that caused the robot to stop moving for a bit, and some involved moving his waist. I really disliked it when he stopped, and really enjoyed it when he moved his waist. I could see that as I judged and reviewed his performance, the pauses almost went away and the waist moves were in every dance.
I also noticed that the more the robot learned, the harsher I had to be with my judging. After the point where it almost never called the wait (sleep) move, I had to look for other things that I liked or disliked. It seems clear that the more dances you judge, the better they get overall. But as you will see below, doing many trials was difficult, long, and cumbersome. Generally I like the dance that the Robosapien does on average, though with a lot more trials I could effectively weed out all of the bad moves.
Although some moves' probabilities went up into the 70s, and others down into the 30s, since there are so many moves, the majority of them linger in the 40s and 50s. With a lot more trials I believe that this method would effectively separate the bad moves from the good moves.
However, if you watch the videos, you will see that he does a lot of moves that move his waist, as per my taste in his dances. It's pretty exciting.
Obstacles:
Although there were many things the robot could do, the things that it couldn't were severe obstacles. For example, I could not find a way to create files on the SD card, so I had no automatic way of saving the learning done in one session. I had to use a hack where calls to System.print are redirected into a log file that has debugging information. I would print the array of saved values to the log file and then copy it as a hard coded initial value inside the code, recompile, and copy the file over to the SD card again. This became very frustrating. If there is a way to create and write to files I would much appreciate hearing about it. The Java mobile libraries don't include the functions necessary to do this that normal Java has, so...
Also, I don't know why, nor do I know whose fault it is, but the dance would always fail on the 5th try. This means that from bootup, I could only do 4 dances before I had to turn the robot off, copy the saved learning, and recompile. Worse yet, sometimes the log file would not be created, so I would lose the data I had spent time learning. I never could figure out why the robot stopped working on the 5th dance because I had no way to do runtime debugging, so I never knew the state of the robot when it failed. No, I could not simply write the status information to the log file, because whenever the robot froze, all I could do was shut the power off, and the log file would never be written. I suspect it is a stack overflow or something of that sort, but I have no way to test anything.
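The print-and-paste workaround described above might be sketched as follows: format the score array as a Java array initializer so the logged line can be pasted straight back into the source. The class and method names are hypothetical, and the example prints with the standard `System.out.println`; how that output reaches the log file on the robot is specific to the platform and not shown here.

```java
// Hypothetical helper that formats the learned scores as pasteable Java source.
class ScoreDump {
    /** Builds a line such as: int[] scores = {50, 62, 41}; */
    static String toInitializer(String name, int[] scores) {
        // StringBuffer is used because it exists in both desktop and mobile Java.
        StringBuffer sb = new StringBuffer();
        sb.append("int[] ").append(name).append(" = {");
        for (int i = 0; i < scores.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(scores[i]);
        }
        sb.append("};");
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] scores = {50, 62, 41};
        // On the robot this line would end up in the redirected log file.
        System.out.println(toInitializer("scores", scores));
    }
}
```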
Combined with all of these frustrations and obstacles, this project took me more than twice as long as it should have. Better documentation of things like how the log file is created would have been of utmost utility here. Not that I blame anyone for it, we're all still getting used to this new platform.
Experience:
Aside from the extreme frustration of the programming environment, using Java, and losing my data frequently, this project was a lot of fun. I enjoyed watching the robot dance and it was cool to see the results of the learning. Overall I am glad that I did this project as I learned a lot of things from it. It was awesome to make all my friends burst out in laughter when they watched the robot dance. It really is quite entertaining. Furthermore, it's exciting to make him dance multiple times because the dance is never the same, so it's interesting to see what will come up based on what he's learned.
Todo:
It would be cool to add another learning algorithm so that he learns not only which moves are best, but also what order to do them in. Sometimes two moves can look individually bad when done in the wrong order, but in the right order they look good. This would be good to consider, but harder to implement and test. A good idea for version 2, if I ever get to it.
Links:
Youtube video -
This is a quick run:
In this second one you see me press his right foot at the end of the dance. This is telling him that I liked the dance he did. I figured I should show this since I abruptly stop filming in the previous video.
Java source code - Link
A note on the Java - I am primarily a C coder, and the lack of function pointers made my life very difficult here. Do not laugh at my switch statement with 150 cases.... If you know how to do variable method calls in Java, please leave a comment, I would love to know. I hated having to write a C program to write a giant Java switch statement....
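One common way to get something like function pointers in Java is an array of objects that share a one-method interface, indexed by move number. The sketch below is only an illustration under stated assumptions: the `Move` interface, the `MoveTable` class, and the three placeholder moves are all hypothetical, the moves append to a trace buffer instead of driving the robot, and it assumes the robot's Java runtime handles anonymous inner classes.

```java
// Hypothetical dispatch table standing in for a large switch statement.
class MoveTable {
    /** One dance move; a real version would call the robot's motion functions. */
    interface Move {
        void perform(StringBuffer trace);
    }

    // Index i holds move i, so choosing a move is just an array lookup.
    static final Move[] MOVES = {
        new Move() {
            public void perform(StringBuffer trace) { trace.append("waist "); }
        },
        new Move() {
            public void perform(StringBuffer trace) { trace.append("arm "); }
        },
        new Move() {
            public void perform(StringBuffer trace) { trace.append("pause "); }
        }
    };

    /** Runs the move with the given number. */
    static void perform(int index, StringBuffer trace) {
        MOVES[index].perform(trace);
    }
}
```

Calling `MoveTable.perform(2, trace)` then runs move 2 without any switch statement, and adding a move means adding one entry to the array.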