Classifying Dogs’ Activities
What we wanted to do
In addition to providing general activity information, we wanted to give pet owners insight into the types of activities their dogs perform (e.g., playing, walking, or resting). By classifying specific activities, Whistle enables pet owners to:
a) Access a journal that automatically logs their dog's daily activities.
b) Track their dog's activities while they are away.
c) Understand how the activities they do with their dog (walks, play time, etc.) influence their dog's overall health and well-being.
To classify individual activities accurately, we had to go beyond the data processing methods used by most activity tracking devices and apply machine learning. Machine learning is a well-researched and rapidly expanding field; however, to our knowledge, it hasn't yet been used to classify dogs' daily activities.
How we did it
The starting point for this classification process is a series of 3-D acceleration measurements that we acquire 50 times per second. To make the classification feasible, the measurements are summarized using a series of numbers that depend more on the dog's activity than on other influencing factors, such as the orientation of the Whistle device or the size of the dog. In machine learning parlance, each number in this summary is typically referred to as a “data feature” and all of the numbers together are called a “feature vector”.
To calculate each of the data features, the sensor measurements that we get are divided into a series of relatively short blocks of data, so that each block is mostly made up of a single type of activity. For each of these blocks, our feature vector is made up of over 20 different data features from both the time and frequency domains.
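To make the blocking step concrete, here is a minimal Python sketch, not Whistle's actual pipeline. The two-second block length, the use of acceleration magnitude, and the three features chosen are all assumptions for illustration; the real feature vector has over 20 features that are not listed in this post.

```python
import math
import statistics

SAMPLE_RATE_HZ = 50   # from the text: 50 measurements per second
BLOCK_SECONDS = 2     # assumption: the post only says "relatively short"


def split_into_blocks(samples, block_len):
    """Cut a list of samples into consecutive, non-overlapping blocks."""
    return [samples[i:i + block_len]
            for i in range(0, len(samples) - block_len + 1, block_len)]


def dominant_frequency(signal, sample_rate):
    """Frequency (Hz) of the strongest non-constant component, via a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return best_k * sample_rate / n


def feature_vector(block, sample_rate=SAMPLE_RATE_HZ):
    """Summarize one block of (x, y, z) samples as a short list of numbers."""
    # The magnitude of the acceleration vector does not change when the
    # device is rotated, so it is less sensitive to device orientation.
    magnitude = [math.sqrt(x * x + y * y + z * z) for x, y, z in block]
    return [
        statistics.mean(magnitude),                   # time domain
        statistics.pstdev(magnitude),                 # time domain
        dominant_frequency(magnitude, sample_rate),   # frequency domain
    ]


# Invented test signal: four seconds of a 2 Hz bounce on the z axis.
samples = [(0.0, 0.0, 1.0 + 0.3 * math.sin(2 * math.pi * 2 * t / SAMPLE_RATE_HZ))
           for t in range(4 * SAMPLE_RATE_HZ)]
blocks = split_into_blocks(samples, BLOCK_SECONDS * SAMPLE_RATE_HZ)
features = feature_vector(blocks[0])
```

Each block yields one feature vector, so a day of data becomes a long sequence of short vectors rather than millions of raw samples.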
As an example of how these data features are related back to different types of dog activities, consider a relatively simple data feature, the standard deviation. The standard deviation is a measure of the variation of the acceleration measurements about the mean. The graph below shows the standard deviation for an accelerometer during a trip to the park to play fetch.
How is the standard deviation related to a dog's activities?
When the dog is resting, the sensor hardly moves at all, which results in a very small standard deviation.
When the dog is walking, the sensor registers each of the dog's steps, so the standard deviation goes up slightly.
When the dog plays fetch, all the running and jumping results in a much larger standard deviation, but note that play time also includes lulls in the data when the dog is waiting for the ball to be thrown.
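The ordering described above can be reproduced with synthetic data. The sketch below is illustrative only: the signal shapes and amplitudes are invented, not taken from Whistle measurements.

```python
import math
import random
import statistics

random.seed(0)   # fixed seed so the sketch is repeatable
N = 100          # two seconds of data at 50 samples per second

# Hypothetical acceleration magnitudes in g; all amplitudes are invented.
resting = [1.0 + random.gauss(0, 0.01) for _ in range(N)]
walking = [1.0 + 0.2 * math.sin(2 * math.pi * 2 * t / 50) + random.gauss(0, 0.01)
           for t in range(N)]
playing = [1.0 + 1.0 * math.sin(2 * math.pi * 4 * t / 50) + random.gauss(0, 0.05)
           for t in range(N)]

# Standard deviation of each block: the spread of the samples about their mean.
std_by_activity = {
    "resting": statistics.pstdev(resting),
    "walking": statistics.pstdev(walking),
    "playing": statistics.pstdev(playing),
}

for name, value in std_by_activity.items():
    print(f"{name}: {value:.3f}")
```

With these invented signals, resting gives the smallest value, walking a somewhat larger one, and playing the largest.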
Based on these results, it may seem like activities could be classified using the standard deviation alone. However, activities other than running and jumping (e.g., scratching or shaking) can also produce a large standard deviation, which would lead to misclassifications. By using a variety of data features in addition to the standard deviation, the relationships between multiple features can often be used to distinguish between these different types of events.
While we might (with enough caffeine and sleepless nights) be able to manually derive some basic rules to determine the type of activity a dog performed based on a few data features (for simple activities, of course), it is much faster, simpler, and more accurate to determine these rules automatically using supervised learning. This machine learning method uses previous examples of each type of activity to determine what type of activity the dog is currently doing. In other words, the computer builds a model of what the feature vector “looks like” when the dog is walking or playing, and classifies new data by comparing the new feature vector to each of these models and seeing which model it is most similar to.
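The idea of comparing a new feature vector to a model of each activity can be sketched with a nearest-centroid classifier. This is a deliberately simple stand-in chosen for illustration, not necessarily the method Whistle uses, and the two-number feature vectors (standard deviation, dominant frequency) and their values are hypothetical.

```python
import math


def centroid(vectors):
    """Average a list of equal-length feature vectors, element by element."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def train(examples):
    """Build one model (here, just a centroid) per labeled activity."""
    return {label: centroid(vectors) for label, vectors in examples.items()}


def classify(models, vector):
    """Return the label whose model is closest to the new feature vector."""
    return min(models, key=lambda label: math.dist(models[label], vector))


# Hypothetical labeled examples: [standard deviation, dominant frequency in Hz].
examples = {
    "resting": [[0.01, 0.0], [0.02, 0.5]],
    "walking": [[0.15, 2.0], [0.20, 2.2]],
    "playing": [[0.70, 4.0], [0.90, 3.5]],
}
models = train(examples)
prediction = classify(models, [0.18, 2.1])   # a new, unlabeled block
```

In practice the features would need to be scaled to comparable ranges before distances are meaningful, and a more capable model would typically replace the centroids.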
To obtain the example data we needed to build these supervised learning models, we needed lots of help from our friends and families as well as the Humane Society of Silicon Valley. These volunteers helped us collect lots of data from dogs doing different activities. Using these data, we generated a large number of feature vectors from each of the activities, which we used to train support vector machines (SVMs) to classify each block of data as a specific type of activity. By applying the trained SVMs to a feature vector generated from a new block of data, we predict what type of activity the dog was doing when those data were acquired. We then aggregate the classifications from each of these small blocks of data into events so that we can help pet owners keep track of what their dog actually did throughout the day instead of just how many “points” they accumulated.
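The final aggregation step, turning per-block classifications into events, might look something like the following sketch. It simply merges consecutive blocks with the same label; the two-second block length is an assumption, and a real system would likely also smooth over brief interruptions such as the lulls during a game of fetch.

```python
from itertools import groupby

BLOCK_SECONDS = 2   # assumption: block length is not stated in the post


def blocks_to_events(labels, block_seconds=BLOCK_SECONDS):
    """Merge runs of identical block labels into (label, start_s, end_s) events."""
    events = []
    start = 0   # index of the first block in the current run
    for label, run in groupby(labels):
        length = len(list(run))
        events.append((label, start * block_seconds, (start + length) * block_seconds))
        start += length
    return events


# Hypothetical per-block predictions, in time order.
labels = ["resting", "resting", "walking", "walking", "walking", "playing", "resting"]
events = blocks_to_events(labels)
```

Each event carries a label plus a start and end time in seconds, which is the kind of record a daily journal entry could be built from.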
Whistle enables you to automatically keep track of your dog's activities and gives you further insight into your dog's life by creating and classifying events using machine learning.
While we have worked hard to develop this methodology so we can provide pet owners with an accurate journal of their dog’s activities, the classification process also allows us to integrate additional data so that we can continually improve our algorithms. As we continue to capture more data across various breeds, ages, and weights, we will be able, for the very first time, to provide users with insights into how their dogs compare to others. In addition to refining our current classifications, we also plan to extend our algorithms to classify additional types of activities.
May 31, 2013