After changing some elements of the experiment, I got the actors behaving in a way closer to what I wanted:
I restructured the inputs, the sensory information that each of the neural networks received. I thought that including so many values that only held information about where some fruit was located, even if they encoded the cardinal direction it lay in, upset the balance with the rest of the inputs. So I reduced them to the following:
- A normalized value, from 0.0 to 1.0, that represents each turtle’s health
- A normalized value, from 0.0 to 1.0, that represents how close the closest fruit is
- A value of 1.0 if a turtle has another one right next to it, and 0.0 otherwise
Although I couldn’t think of an obvious way the final input would affect the behavior, it was information present in the simulation, and part of a neural network’s job is to ignore the information that doesn’t help it achieve its objective.
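To make the three inputs concrete, here is a minimal sketch of how such a vector could be assembled. It is not the code from the experiment: the `Turtle` record, the `build_inputs` function, the use of Manhattan distance, the `max_distance` cap, and the choice to count diagonal tiles as "right next to" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Turtle:
    # Hypothetical actor record; field names are assumptions for this sketch.
    x: int
    y: int
    health: float
    max_health: float = 100.0


def build_inputs(turtle, others, fruits, max_distance):
    """Return [health, fruit closeness, neighbor flag] for one turtle."""
    # Input 1: health scaled into [0.0, 1.0].
    health = min(max(turtle.health / turtle.max_health, 0.0), 1.0)

    # Input 2: closeness of the nearest fruit (assumed: Manhattan distance,
    # 1.0 when standing on the fruit, 0.0 at max_distance or with no fruit).
    if fruits:
        nearest = min(abs(fx - turtle.x) + abs(fy - turtle.y) for fx, fy in fruits)
        closeness = 1.0 - min(nearest, max_distance) / max_distance
    else:
        closeness = 0.0

    # Input 3: 1.0 if another turtle occupies one of the eight surrounding
    # tiles (assumed definition of "right next to"), 0.0 otherwise.
    has_neighbor = 1.0 if any(
        max(abs(o.x - turtle.x), abs(o.y - turtle.y)) == 1 for o in others
    ) else 0.0

    return [health, closeness, has_neighbor]
```

Keeping every input in the same 0.0 to 1.0 range is what stops any single signal from outweighing the others, which was the point of cutting down the fruit-related values.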
I also added an output: if it received the maximum value, the actor would walk one tile in a random direction. As the video shows, within a few generations the turtles that received the highest values for that single output ended up reproducing more, because moving around the map got them closer to the fruit. They dominated so thoroughly that I reduced the amount of fruit present at any given moment, to make sure they weren’t just walking over it by chance. Many members of many generations gravitate towards the fruit; after all, the inputs include a measure of how close the closest piece is. I don’t know whether the information about having another turtle right next to them affected anything.
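A rough sketch of how that extra output could be wired in follows. It assumes that "received the maximum value" means the output with the highest activation wins, that movement is limited to the four cardinal directions, and that the map clamps positions at its edges. The names `choose_action`, `random_step`, and the action labels are hypothetical and not taken from the experiment.

```python
import random

# Assumed movement set: one tile up, right, down, or left.
DIRECTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def choose_action(outputs, action_names):
    """Pick the action whose network output has the highest value."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return action_names[best]


def random_step(x, y, width, height, rng=random):
    """Move one tile in a random direction, clamped to the map bounds."""
    dx, dy = rng.choice(DIRECTIONS)
    new_x = min(max(x + dx, 0), width - 1)
    new_y = min(max(y + dy, 0), height - 1)
    return new_x, new_y


# Usage: the second output wins, so the turtle takes a random step.
actions = ["stay", "random_walk"]  # hypothetical action labels
if choose_action([0.2, 0.8], actions) == "random_walk":
    position = random_step(3, 3, width=10, height=10)
```

With a scheme like this, a turtle that favors the random-walk output covers more of the map, which matches the observation that those turtles reached fruit more often until the fruit was made scarcer.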
The experiment went well enough for me, and I moved on to more interesting ones.