Monday, March 3, 2008

Project 3

by Luis Daniel Ballesteros

Goal:
The goal of this project was to create a vision program that can differentiate me from other people or things. I chose to broaden the scope of the program to have it differentiate one thing from another, as opposed to just saying me or not me. Ideally by the time the program is done, it will be able to determine if a thing it sees is something it has seen before or something new. If it has been seen before, it should be identified, and if it is new, it should be added to a database to be recognized in the future.

Algorithm:
To find faces or other objects of interest in the picture, I used the approach discussed in class of knowing what is background and what is foreground. Before each sample, a calibration picture must be taken, and the program compares the sample against it to find what is new in the picture.
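For illustration, here is a minimal sketch of what that calibration comparison might look like, assuming pictures are already loaded as int[][] arrays of packed 0xRRGGBB pixels (the names and threshold here are hypothetical, not my actual code):

```java
// Hypothetical sketch of the calibration step: a pixel is "active"
// (foreground) if any RGB channel differs from the calibration
// picture by more than a threshold.
static boolean[][] findActivePixels(int[][] calib, int[][] sample, int threshold) {
    boolean[][] active = new boolean[sample.length][sample[0].length];
    for (int y = 0; y < sample.length; y++) {
        for (int x = 0; x < sample[y].length; x++) {
            int c = calib[y][x], s = sample[y][x];
            int dr = Math.abs(((c >> 16) & 0xFF) - ((s >> 16) & 0xFF));
            int dg = Math.abs(((c >> 8) & 0xFF) - ((s >> 8) & 0xFF));
            int db = Math.abs((c & 0xFF) - (s & 0xFF));
            active[y][x] = Math.max(dr, Math.max(dg, db)) > threshold;
        }
    }
    return active;
}
```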
After finding the point of action, or object, in the picture, the program approximates its size, making a box by finding the leftmost, topmost, rightmost, and bottommost pixels in the object of action, and remembers these points for comparison in the future. Since each picture in the database is saved with this information, the program can map any picture directly onto its active site.
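A sketch of that boxing step, reusing the foreground mask from the sketch above:

```java
// Box the active region by scanning the foreground mask for the
// extreme active pixels in each direction.
static int[] boundActiveSite(boolean[][] active) {
    int top = Integer.MAX_VALUE, left = Integer.MAX_VALUE, bottom = -1, right = -1;
    for (int y = 0; y < active.length; y++) {
        for (int x = 0; x < active[y].length; x++) {
            if (active[y][x]) {
                if (y < top) top = y;
                if (y > bottom) bottom = y;
                if (x < left) left = x;
                if (x > right) right = x;
            }
        }
    }
    return new int[] { left, top, right, bottom }; // saved with each picture
}
```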
Since the active site might be bigger in one picture than in another, when making a comparison against a picture in the database, the program finds the ratio of the size of one active site to the other and compares proportionately. For example, if the sample is 2 times as tall and 3 times as wide as the database picture it's being compared to, it compares a 2x3 rectangle in the sample to a single pixel in the database picture. For simplicity, we take the average color of that 2x3 rectangle for the comparison.
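The block averaging might look something like this (a sketch; blockW and blockH come from the size ratio between the two active sites):

```java
// Average the colors in a blockW x blockH rectangle of the sample so
// it can be compared against a single pixel of the database picture.
static int averageColor(int[][] img, int x0, int y0, int blockW, int blockH) {
    long r = 0, g = 0, b = 0;
    int n = blockW * blockH;
    for (int y = y0; y < y0 + blockH; y++) {
        for (int x = x0; x < x0 + blockW; x++) {
            int c = img[y][x];
            r += (c >> 16) & 0xFF;
            g += (c >> 8) & 0xFF;
            b += c & 0xFF;
        }
    }
    return (int) ((r / n) << 16 | (g / n) << 8 | (b / n));
}
```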
When comparing pixels, it treats the RGB components of a pixel as a 3-dimensional vector and subtracts one vector from another to find the color difference. We then take the length of this difference vector to quantify the difference as something that can be compared. When comparing whole sections, I keep a running sum of these per-pixel differences for the whole picture. This scheme is imperfect, as it is possible to trick the algorithm with certain perfectly symmetrical combinations that balance out in the sum, but keeping a several-hundred-dimensional vector of the differences at each pixel for two different image comparisons, and then finding the length of the vector of their difference, is not so fun.
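In code, the comparison might look roughly like this (again a sketch, assuming the sample has already been reduced to the database picture's dimensions via the block averaging above):

```java
// Pixel comparison: treat RGB as a 3-D vector, subtract, and take
// the Euclidean length of the difference.
static double colorDistance(int a, int b) {
    int dr = ((a >> 16) & 0xFF) - ((b >> 16) & 0xFF);
    int dg = ((a >> 8) & 0xFF) - ((b >> 8) & 0xFF);
    int db = (a & 0xFF) - (b & 0xFF);
    return Math.sqrt(dr * dr + dg * dg + db * db);
}

// Whole-section comparison: a running sum of per-pixel distances.
// A match is then just sectionDifference(...) below some threshold.
static double sectionDifference(int[][] a, int[][] b) {
    double sum = 0;
    for (int y = 0; y < a.length; y++)
        for (int x = 0; x < a[y].length; x++)
            sum += colorDistance(a[y][x], b[y][x]);
    return sum;
}
```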
For finding a match, I do a simple threshold test: the sample picture's total difference from a database picture must fall below a certain threshold to be considered a match. If there is no match, the picture is not recognized and is added to the database. Pretty simple scheme here.
The end result is a program which takes a picture to analyze as an argument, uses the color distribution to compare it to the pictures it knows, finds the differences, and determines if it is a close enough match to any of them. Although this is pretty much the algorithm discussed in class, it seemed like the best way to do it.

Results:
First of all, active sections of the picture are surprisingly well recognized. Further, pictures of me are recognized, but certain conditions must be met. Since the end result of the algorithm is creating color histograms, a significant change in lighting completely breaks the algorithm. This is why, at first, when performing tests in my dark room, where the overall color variety was not so high, the algorithm had a significantly higher chance of failing. Once I moved to a more well-lit area, it worked much better, since shadows and light angles made much less of a difference.
I first tested this against people who had a significant difference in color. The algorithm was pretty good at differentiating me from my Indian friend and from my friend with blonde hair. It also worked fairly well against my friend who has hair of a similar color to mine, but a beard.
Overall, although the program is not perfect and is sensitive to lighting conditions, it did a satisfactory job.

Obstacles:
Math! I had a weird time finding the best way to quantify differences between pictures, and I am still not satisfied with what I decided on. This is probably a large field of research though, so I don't feel too bad.
Also, I first started trying to write this program to analyze .jpg images, but I found them far too complicated to decompress and get the data from. I settled on using 24-bit color bitmap files since the data is easy to analyze and manipulate. It is simply a header followed by 8 bits of blue, 8 bits of green, and 8 bits of red for each pixel therein, with each row padded to a multiple of 4 bytes.
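For reference, here is a minimal sketch of reading such a file in Java; it assumes an uncompressed, bottom-up 24-bit bitmap and skips error handling:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Minimal sketch: read an uncompressed 24-bit BMP into an int[][] of
// packed 0xRRGGBB pixels.
public class BmpReader {
    public static int[][] read(String path) throws IOException {
        DataInputStream in = new DataInputStream(new FileInputStream(path));
        byte[] header = new byte[54]; // file header (14) + info header (40)
        in.readFully(header);
        int dataOffset = le32(header, 10);
        int width  = le32(header, 18);
        int height = le32(header, 22);
        in.skipBytes(dataOffset - 54);          // skip to the pixel array
        int rowPad = (4 - (width * 3) % 4) % 4; // rows pad to 4-byte multiples
        int[][] img = new int[height][width];
        for (int y = height - 1; y >= 0; y--) { // rows are stored bottom-up
            for (int x = 0; x < width; x++) {
                int b = in.read(), g = in.read(), r = in.read(); // BGR order
                img[y][x] = (r << 16) | (g << 8) | b;
            }
            in.skipBytes(rowPad);
        }
        in.close();
        return img;
    }

    // Little-endian 32-bit int starting at offset off.
    private static int le32(byte[] a, int off) {
        return (a[off] & 0xFF) | (a[off + 1] & 0xFF) << 8
             | (a[off + 2] & 0xFF) << 16 | (a[off + 3] & 0xFF) << 24;
    }
}
```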
I refrained from actually saving the analyzed active site as a new, smaller picture since I found I could not correctly create the bitmap header needed for the new picture, and I decided it wasn't worth the time to debug. Thus I just save the information for finding the active site within the picture itself. Note that although this is a rectangular region, after the calibration process non-active pixels are set to 0, so they contribute nothing towards the difference sum.

Todo:
I think that a better difference metric would have been great. A difference algorithm that is sensitive to location AND color, as opposed to mine, which is sensitive to color with only some consideration of location, would improve my program at least tenfold.
I also apologize for using the algorithm presented in class; I don't know much about this sort of image manipulation and found some of the suggestions very helpful. So another way to improve the program would be to make the algorithm more original.

Samples:
Here are some better samples of what my program generates for different faces; you can see it is imperfect, but overall well behaved.
A further comparison of the two processed images then correctly concludes that they are not the same person.
Some data is lost, however, since my friend reflects light a lot and part of his face ends up with the same color configuration as the background...


Sunday, February 17, 2008

Project 2

by Luis Daniel Ballesteros

Goal:
The goal of this project was to develop an algorithm for the RS Media to learn to perform some action. After analyzing the resources available for doing this, and given the awkward positions and movements of the robot's arms and body, I decided that throwing a ball at a target was the best way to go. Thus the goal of this project was to see effective learning in the context of changing behavior so that a thrown ball hits a specific target.

Algorithm:
The algorithm I used for this was a mix of simple and complex. The feedback that the robot received from me regarding its performance was a simple too-far or not-far-enough signal. Thus the value analysis function was pretty simple, but it changed its internal representation depending on the context.

To alter its behavior and perform better, the robot had a set of actions that it could adjust. The base movement for the throw was a combination of swinging the waist, lifting the arm, and twisting the wrist. Thus the variables that could be changed in this system were the speed and distance of the waist turn, the speed and height of the arm both initially and during the throw, and finally the timing of the wrist twist. The combination of these variables created a complex action space for the robot to explore.
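In code terms, the action space amounts to a handful of tunable integers, something like this hypothetical bookkeeping (the names and units are mine, not the actual code; units are abstract "notches" that get mapped onto the robot's macro calls):

```java
// Hypothetical bookkeeping for the throw's tunable variables.
class ThrowParams {
    int waistSpeed, waistDistance;     // speed and distance of the waist turn
    int armSpeedStart, armHeightStart; // arm speed and height initially
    int armSpeedThrow, armHeightThrow; // arm speed and height during the throw
    int wristTwistTime;                // when to twist the wrist
}
```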

Upon each failure, the robot would, if the throw was too far or too close, adjust one unit in each of the ways it could change, making the throw one unit closer or farther on the next trial. If, after a full-unit change in one direction, it received a signal to change in the other direction, it would not make a full unit's change back. The reason is that it might first have been not far enough, but then, after analysis and change, too far; going all the way back to where it started since the last change would leave it oscillating forever between too far and not far enough. So in this case, it would adjust only one variable that it saw fit in the other direction. Consecutive feedback in that same direction would toggle which variables and combinations were changed. For instance, if it went too far, maybe it wouldn't lift its arm as high next time; if that was still too far, it would reset the arm variable and twist the wrist sooner. It would eventually try a combination of lower arm and earlier wrist, and so on, until it had stepped one full unit in the backwards direction, at which point it would reset from these incremental changes back to full-unit changes. This way, if you move the target, it can make full-unit changes to find it again.
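Here is a simplified sketch of that search, with the variables above flattened into an int[]; the real version toggles through combinations of variables, whereas this sketch just rotates through them one at a time:

```java
// Simplified "hot and cold" search over the throw's variables.
class HotAndCold {
    int[] params;            // current throw settings, one entry per variable
    boolean fineMode = false;
    int lastDir = 0;         // direction of the previous adjustment
    int fineIndex = 0;       // which variable fine mode nudges next
    int fineSteps = 0;       // fine-mode steps since the last reset

    HotAndCold(int[] initial) { params = initial; }

    // feedback: +1 = "not far enough", -1 = "too far"
    void adjust(int feedback) {
        if (!fineMode && lastDir != 0 && feedback == -lastDir) {
            fineMode = true;  // overshot: stop making full-unit changes
            fineSteps = 0;
        }
        if (fineMode) {
            params[fineIndex] += feedback;               // nudge one variable
            fineIndex = (fineIndex + 1) % params.length; // toggle variables
            if (++fineSteps >= params.length) {
                fineMode = false; // a full unit's worth: back to big steps
            }
        } else {
            for (int i = 0; i < params.length; i++) {
                params[i] += feedback; // full-unit change in every variable
            }
        }
        lastDir = feedback;
    }
}
```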

Thus this algorithm was just a complex game of hot and cold. The robot received feedback through its left and right feet for too close or too far, and you could hit one of the back triggers on the feet to keep the movement the same, or to quit the demonstration. However, learning is not saved across sessions, so turning off the robot or quitting requires relearning everything all over again the next time it starts up.

Results:
I was surprised by the results of this algorithm in that it was actually quite effective, though it did take a few throws to get it just right. I enjoyed the fact that I could move the cup elsewhere and the robot would be able to find it again.

The only disappointment I had with the results was that the robot couldn't throw very far, so this was more an exercise in precisely dropping the ball with a bit of speed than it really was of throwing it. Other than that, this was a success.

Obstacles:
A very sad part of this whole project is that the entire throwing mechanic is hacked together from what little control is available. For instance, the library has no functions for direct control of the hand. All functions that do something with the hand trigger long sequences of movements, like reaching out and requesting an object, or defiantly making a motion to drop the object. There were no functions that could be invoked during other movements, since they all initiated their own complete macros for the whole robot. That being the case, I could not have the robot actually grab the ball, but luckily enough, the ball fit on top of his hand, in between the thumb and forefinger.

At this point, all I had to do was get the palm in the right orientation so that the ball could fit in this groove. Unfortunately, again, there are no functions which directly control movement of the wrist... I found only 2: one twists the wrist all the way up, the other twists it all the way down. Thus, to prepare the ball in his hand, I had to have him twist the wrist all the way up, then physically move it to a level position and place the ball on top of his hand. Then, when he threw, I simply had him spin his wrist back down and flick the ball off the top while it was moving. Combined with the waist and arm movement, this made somewhat of a throw...

In the end, though, needing to physically level the hand muddied the data and the learning a bit, since this step was not guaranteed to be the same each time; it added a level of randomness and uncertainty to the learning. Some may see this as a good thing, though...

Experience:
Overall, I once again had fun having the robot do something interesting. I enjoyed showing my friends what he could do, and at times he was even better at the game than some of them... I think both the robot and I learned something from this project.

Todo:
It would have been nice to have some mechanic for gauging by how much he missed and in what direction, rather than just a too-far or not-far-enough signal. This would have given the robot a better value function in the algorithm and thus let it learn more efficiently.

Links:
Here is a video of the robot in action. He is talented!

Sunday, February 3, 2008

Humanoids Project 1

by Luis Daniel Ballesteros

Goal:
The initial idea behind this project is to make the robot dance. Of course, simply executing a series of moves is boring. I wanted the robot to learn to dance, so I devised a simple probabilistic learning algorithm. The goal of my project, then, is to see results indicative of learning in the robot's dances.

Algorithm:
The algorithm is simple, yet complicated. As units of movement, I copied all of the preset move functions from the sample dances and numbered them move0 through move120. I then wrote about 30 of my own moves, combinations of movements that seemed interesting to add to the pool, for a total of 151 possible moves. Now all the robot needed was a way of invoking the moves that are best fit for the song playing.

To choose moves, I set up a model where good moves have a higher probability of coming up than bad moves. The way I did this was to give each move a score from 0 to 99. To choose a move, I do 100 samples, numbered 0 to 99. For each sample, the program randomly chooses a move whose score is at least as high as the sample number and gives that move a point. Thus a move with a score of 30 is eligible for fewer samples, and so has fewer potential points, than one with a score of 80. At the end of the 100 samples, the move with the most points is executed.
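A sketch of that draw in standard Java (not the actual robot code; the exclusion of already-performed moves is explained further down):

```java
// Weighted draw: 100 samples numbered 0-99; each sample gives a point
// to a random move whose score is at least the sample number, and the
// move with the most points wins. Higher-scoring moves are eligible
// for more samples, so they win more often.
static int chooseMove(int[] scores, boolean[] alreadyUsed) {
    int[] points = new int[scores.length];
    for (int s = 0; s < 100; s++) {
        // collect the moves eligible for this sample
        int count = 0;
        int[] eligible = new int[scores.length];
        for (int m = 0; m < scores.length; m++) {
            if (!alreadyUsed[m] && scores[m] >= s) {
                eligible[count++] = m;
            }
        }
        if (count > 0) {
            points[eligible[(int) (Math.random() * count)]]++;
        }
    }
    int best = 0;
    for (int m = 1; m < points.length; m++) {
        if (points[m] > points[best]) best = m;
    }
    return best;
}
```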

At the end of the dance, the robot takes input from the user to judge how good or bad the dance was. If the dance was good, all moves performed receive a score bonus. If it was bad, all moves performed have their scores reduced. The user can also choose to vote neutrally and not change the scores.
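The feedback step is then just a clamped score adjustment, something like this (BONUS is a hypothetical constant for how hard one vote pushes a score):

```java
static final int BONUS = 5;

// vote: +1 = good dance, -1 = bad dance, 0 = neutral.
// Scores stay clamped to the 0-99 range.
static void judgeDance(int[] scores, int[] movesPerformed, int vote) {
    for (int i = 0; i < movesPerformed.length; i++) {
        int m = movesPerformed[i];
        scores[m] = Math.min(99, Math.max(0, scores[m] + vote * BONUS));
    }
}
```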

Also note that no move ever has zero probability: although I do not allow repeat moves within a dance, even moves with a score of 0 are still considered in the draw. This is good for reviving moves that may have seemed bad in context but are actually not that bad.

Results:
I was personally surprised by the results of the learning. In the beginning, every move had a score of 50, so they all had an even chance of being chosen. However, after a few runs, it became clear that the moves I liked were happening more frequently and the ones I didn't were not. For example, some of the moves were simply sleep instructions that caused the robot to stop moving for a bit, and some involved moving his waist. I really disliked it when he stopped, and really enjoyed it when he moved his waist. I could see that as I judged and reviewed his performances, the pauses almost went away and the waist moves were in every dance.

I also noticed that the more the robot learned, the harsher I had to be with my judging. After the point where it almost never called the wait (pause) moves, I had to look for other things that I liked or disliked. It seems clear that the more dances you judge, the better they get overall. But as you will see below, doing many trials was difficult, long, and cumbersome. Generally I like the dance that robosapien does on average, though with a lot more trials I could effectively weed out all of the bad moves.

Although some moves' scores went up into the 70s, and others down into the 30s, there are so many moves that the majority of them linger in the 40s and 50s. With a lot more trials, I believe this method would effectively separate the bad moves from the good moves.
However, if you watch the videos, you will see that he does a lot of moves that involve his waist, as per my taste in his dances. It's pretty exciting.

Obstacles:
Although there were many things the robot could do, the things it couldn't do were severe obstacles. For example, I could not find a way to create files on the SD card, so I had no automatic way of saving the learning done in one session. I had to use a hack: calls to System.out.print are redirected into a log file that holds debugging information, so I would print the array of saved values to the log file, copy it into the code as a hard-coded initial value, recompile, and copy the file over to the SD card again. This became very frustrating. If there is a way to create and write to files, I would much appreciate hearing about it. The Java mobile libraries don't include the file functions that normal Java has, so...

Also, I don't know why, nor do I know whose fault it is, but the dance would always fail on the 5th try. This means that from bootup, I could only do 4 dances before I had to turn it off, copy the saved learning, and recompile. Worse yet, sometimes the log file would not be created, so I would lose the data I had spent time learning. I never could figure out why the robot stopped working on the 5th dance because I had no way to do runtime debugging, so I never knew the state of the robot when it failed. No, I could not simply write the status information to the log file, because whenever the robot froze, all I could do was shut the power off, and the log file would never be written. I suspect it is a stack overflow or something of that sort, but I have no way to test anything.

Between all of these frustrations and obstacles, this project took me more than twice as long as it should have. Better documentation of things like how the log file is created would have been of utmost utility here. Not that I blame anyone for it; we're all still getting used to this new platform.

Experience:
Aside from the extreme frustration of the programming environment, using Java, and losing my data frequently, this project was a lot of fun. I enjoyed watching the robot dance, and it was cool to see the results of the learning. Overall I am glad that I did this project, as I learned a lot from it. It was awesome to make all my friends burst out in laughter when they watched the robot dance. It really is quite entertaining. Furthermore, it's exciting to make him dance multiple times because the dance is never the same, so it's interesting to see what will come up based on what he's learned.

Todo:
It would be cool to install another learning algorithm so that he not only learned which moves are best, but also what order to do them in. Sometimes two moves can individually look bad when done in the wrong order, but in the right order they look good. This would be good to consider, but harder to implement and test. A good idea for version 2, if it ever gets to it.

Links:
Youtube video -
This is a quick run:



In this second one you see me press his right foot at the end of the dance. This is telling him that I liked the dance he did. I figured I should show this since I abruptly stop filming in the previous video.

Java source code - Link
A note on the Java: I am primarily a C coder, and the lack of function pointers made my life very difficult here. Do not laugh at my switch statement with 150 cases... If you know how to do variable method calls in Java, please leave a comment; I would love to know. I hated having to write a C program to write a giant Java switch statement...
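For what it's worth, one switch-free pattern that should work even on Java ME (which has no reflection) is a dispatch table of move objects; a sketch, with move0/move1 standing in for the real move functions:

```java
// Each move lives behind a tiny interface, so the moves can sit in an
// array and be invoked by index instead of through a switch.
interface Move { void perform(); }

class Dancer {
    void move0() { /* ... */ }
    void move1() { /* ... */ }

    final Move[] moves = {
        new Move() { public void perform() { move0(); } },
        new Move() { public void perform() { move1(); } },
        // ... one entry per move; the table still has to be written out
        // once, but invoking a chosen move is now just moves[i].perform();
    };
}
```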