Creating Collar-sensed Motion Gestures for Dog-Human Communication in Service Applications


Creating Collar-sensed Motion Gestures for Dog-Human Communication in Service Applications

Giancarlo Valentin, Joelle Alcaidinho (giancarlo, joelle@gatech.edu), Ayanna Howard (ayanna.howard@ece.gatech.edu), Melody M. Jackson, Thad Starner (melody, thad@gatech.edu)

ABSTRACT
Working dogs are dogs with one or more specific skills that enable them to perform essential tasks for humans. In this paper we examined motion gestures that working dogs could use to unambiguously communicate with their human companions. We analyzed these gestures in terms of true positives and propensity for false positives by comparing their dynamic time warping distances against a set of everyday gesture libraries (EGL) representing the dogs' daily movements. We found four gestures that could be concretely defined, trained, and recognized. These gestures were recognized with 75-100% accuracy, and their false positive rate averaged less than one per hour.

ACM Classification Keywords
H.5.m. Information Interfaces and Presentation (e.g. HCI): Miscellaneous

Author Keywords
Wearable technology; Animal-Computer Interaction; Gesture recognition

INTRODUCTION
We define working dogs as canines with one or more specific skills that enable them to perform essential tasks for humans. Working dogs that assist humans with disabilities are called assistance dogs. Other working dog occupations include field work, such as search and rescue (SAR) or explosive detection.

The roles of working dogs continue to evolve alongside, and sometimes with, technology. Just as advances in semiconductors have led to increased capabilities in computing, sensing and automation, advances in the field of canine cognition have augmented the possibilities for dog-human partnerships. Working dog roles rely on the ability of dogs to sense the environment in great detail, an ability that can be augmented with occupation-specific training. For example, guide dogs, which assist the visually impaired, can distinguish between a "wait" obstacle (e.g. a car) and a "go-around" obstacle (e.g. a trashcan) [17]. Explosive-detection dogs can categorize explosives based on chemical characteristics, most notably between "stable" and "unstable" compounds [8].

Unfortunately, user interviews [23] suggest that the information perceived by working dogs often exceeds their ability to communicate it to humans. We previously classified these barriers to communication into three categories [23]. Perceptual barriers result from dogs needing to communicate something they can sense but their human companions cannot. This barrier might be the result of a person's disability (e.g. visual impairment) or a human sensory limitation compared to canines (e.g. olfaction). Because human senses cannot perceive what the dog is sensing, the information must be communicated explicitly through the remaining available channels. Distance barriers are present, for example, in canine-aided search and rescue, where the dogs' communication signals (barking, positioning, etc.) might be ineffective at distances beyond line of sight or hearing. Contextual barriers are present when medical alert dogs must notify humans other than their companions that there is an emergency. Because their signaling behaviors are often only understood by their companions, the alert can be misinterpreted or ignored, possibly delaying medical attention.

Despite these barriers, dogs can use specific signs for communication. Two types of signs discussed in semiotics [16] are shown below (Figure 1).
Indexical signs can represent the information being conveyed, while symbolic signs can represent any type of information.

Figure 1. The focus of the present study is to find movements suitable for symbolic communication. Rightmost image reprinted with permission from Yellow Neener Photography.

To reduce communication barriers between canines and humans, we propose training dogs to perform gestures as a means to communicate situations they have been trained to recognize. We believe these gestures can be recognized using inertial sensors on the collar. We chose the collar placement because canine communication gestures frequently include head movements [13] and a collar represents little additional overhead in terms of equipment worn by the dog. This consideration is important because service dog harnesses vary by organization, police dogs already wear heavy harnesses, and search and rescue dogs often wear no harness at all.

Related Work
The dog-at-the-keyboard project was one of the first documented efforts aimed at symbolic dog-to-human communication [21]. A modified toy keyboard provided dogs the means to make requests by pressing keys that produced sounds. Findings suggested that dogs may be able to learn a conventional system of signs associated with specific objects and activities. Search and rescue dogs are also known to communicate symbolically to accomplish important tasks. In some jurisdictions they bite brightly colored cylindrical objects called bringsels to communicate a find within line of sight [6]. More recently, the FIDO project has explored wearable input devices that dogs can activate by biting, tugging or performing a nose-touch as a form of communication [12].

In terms of our methods, we distinguish between gesture and activity recognition, with the former being closer to our goal. Rather than gestures for individual applications (e.g., gaming), we rely heavily on Ashbrook's work on micro-interactions and its emphasis on always-on gestures, where false positives are to be avoided [2].

In the realm of canines, inertial sensing has focused mostly on posture and activity recognition. One early effort attempted to detect posture (sitting, walking, standing and lying) in urban search and rescue dogs [19, 20]; a rule-based algorithm achieved an accuracy of 76%. A subsequent effort attempted to detect postures as part of an automated training system [4, 5]. A more recent effort by researchers at Newcastle University [13] monitored activities that correspond to healthy behavior traits of pet dogs; seventeen activities were detected by an offline PCA-based algorithm with an accuracy of 76%. A similar study at Eötvös Loránd University used an offline support vector machine to classify lying down, sitting, galloping and cantering with more than 80% accuracy in subject-independent testing [9].

Previous systems have successfully addressed the detection of existing behaviors of daily living. In contrast, the scope of this project does not include detecting activities or emotional state; we focus exclusively on creating and detecting new gestures for symbolic communication. The peculiarities that differentiate this problem from traditional work in human activity or gesture recognition, highlighted by Valentin et al. [22], are the following:
1. Users are unable to annotate their own data.
2. Users are unable to reposition the sensor if it becomes dislodged.
3. Movements are non-periodic and short in duration.
4. Unlike humans, dogs are not expected to modify their behavior to increase true positives or decrease false positives.
5. The gestures must be taught to the participants without verbal descriptors.
METHOD
Gesture creation, selection and training requisites
Before settling on the collar-based system, we attached the sensor to the front of a harness [22]. This placement was unaffected by the regular rotations of a collar. The behaviors detectable from this placement were activities (lie down, run, jump) and changes in posture (e.g. sit from stand). It became evident that these everyday behaviors could not be re-purposed as communication gestures because, by definition, they happened regularly.

Once we settled on the collar placement we considered the potential neck or body movements we could use to form gestures: moving the neck left, right, up or down. We also considered rotations of the sensor along these dimensions, for example roll (as in flight dynamics, or "roll over"), spin (a 360° clockwise rotation about the yaw axis) and twirl (a 360° counterclockwise rotation about the yaw axis). Note that a similar movement on the x (pitch) axis would amount to a back flip, which we did not consider. Although some preliminary results showed that roll could be identified by detecting a sign inversion of the gravity axis, even well-trained dogs seemed uncomfortable rolling on a bare floor, and it was not clear that working dogs wearing a leash or harness would be able to roll on a given surface.

Vertical gestures (looking up and down) were difficult to train because there were few analogues to intentionally moving in this way. As dogs walk, their heads move up and down many times per minute, so distinguishing an intentional vertical gesture from the act of walking would require precise training and a significant time investment. Furthermore, when training a dog, eye contact with the trainer is extremely important, and keeping the head still until a cue is given would involve an additional training task. As a last resort, we used a target on the floor during training, but this approach gave the impression that the desired action was to touch the floor and wait until rewarded, rather than to produce a nod-like gesture. None of these obstacles is individually insurmountable; collectively, however, they were sufficient to postpone further exploration of these gestures until a later time. We summarized the lessons from our explorations as seven criteria necessary for suitable gestures [23].

The gestures that remained were right reach, left reach, spin and twirl. Reaching left and right were inspired by activation gestures used to trigger wearable interfaces [12]. Spin and twirl are behaviors already taught to some dogs, but not performed regularly enough to be discarded at this stage.

System and Equipment
In addition to the instrumented dog vest (Figure 2), the main piece of equipment used for this study was a commercially available inertial sensor, the Shimmer3 [11] by Shimmer Sensing, Inc. This unit consists of a 9-axis sensor, including a three-axis accelerometer, gyroscope and magnetometer. The sampling frequency was set to 51.2 Hz.

Candidate gesture   Description
Spin                Clockwise rotation of 360°
Twirl               Counterclockwise rotation of 360°
Right reach         Reach to the right ribcage and return
Left reach          Reach to the left ribcage and return

Table 1. The four gestures we examined in this study.

We selected the Shimmer3 for its light weight and small size (51 mm x 34 mm x 14 mm) compared to sensors with similar capabilities. Weight is an extremely important consideration because heavy objects might obstruct the intended movements. The weight of the Shimmer3, 28.3 grams, is significantly below the maximum weight guideline of 4% of the animal's body weight for wearable devices used in animal-computer interaction research [24, 25]. Although the sensor was allowed to rotate, it settled at the bottom of the neck due to gravity. Finally, we used a two-pocket vest to carry a mobile phone recording the data wirelessly for longer-term recording, to match our target usage scenario.

Figure 2. Detail of the instrumented Julius K9 vest with targets that made the gestures more precise. The Shimmer3 was attached to a custom collar that was placed loosely on the neck.

Protocol and participants
Training method for offering selected gestures
At the beginning of our method section we showed how we attempted to capture movements that were ultimately discarded. We now describe how we prompted the gestures we further examined on a dog with no previous experience performing gestures on cue. To avoid training a dog to perform a gesture that would ultimately be indistinguishable from everyday movement, we first tried to "lure" the dogs into performing each candidate gesture. Luring is a technique by which dogs follow a target object (e.g. a treat or toy) to perform an action [1, 23]. We ultimately realized that the readings from a lured action are more representative of the trainer's luring technique than of the dog's performance, and hence could not be used as a stand-in for the dog performing the gesture on his own. Luring was still valuable in training the dogs, but we did not record these instances as examples.

Instead, we had to ensure dogs could learn to offer these gestures after being given a visual or verbal cue. Our first participant had no previous experience with wearable activation interfaces and would not offer actions like reach left or reach right spontaneously. Our second participant had experience with the gestures but performed them in broad, undirected ways when not wearing a wearable activation interface, because of the lack of a precise target. We realized that even though a gestural system does not require a dedicated affordance for each message [12], it was still necessary to have some type of target while the dog was learning and until our recognizer could provide feedback on correct completion. Although we experimented with auditory feedback (throughout or upon completing a gesture), we discovered that a simple vest-based two-target system was enough to obtain the precision we required (Figure 3). Once a gesture was trained, the dog no longer required the vest.

Figure 3. One participant performs a twirl and a left reach. The vest carried two brightly colored targets, one on each side.
We built the targets out of bright yellow 3.81 cm (1.5 in) diameter balls to make them easier for the dogs to see [14]. Originally we had used a dark object against a dark background (the vest) as a marker, but it was harder to locate because it did not provide enough visual contrast for the dogs [15].

Participants
For this study, one human trained two dogs using positive reinforcement. One dog, a two-year-old retriever (S1) with assistance training, had no experience with wearable interfaces and was trained to perform the gestures exclusively for this experiment. A seven-year-old Border Collie (S2) had three years of experience with wearable interfaces. In general we selected participants based on the following criteria:
- Availability of the dogs and their human companion
- Proximity to the testing location
- Ability to participate without compromising training

Training protocol
When a dog performed (or attempted) a gesture, he received a food reward (a 1 cm sized treat) [10]. Once the dog demonstrated that he could perform the gesture correctly in at least 65% of our requests, the reinforcement schedule was narrowed to deliver a treat only for a correctly performed gesture [3]. Throughout this process the dog also received immediate feedback through the sound of a clicker [18]. For training the rotational gestures (spin and twirl) we relied on luring at the early training stage before transitioning to a subtle hand signal and verbal cue.

Each training session lasted no more than 10 minutes. The average learning time for each gesture varied depending on the dog's prior training experience, but did not exceed 15 training sessions per gesture. Gestures were trained both off-leash and on-leash with the less experienced dog. The more experienced dog was trained off-leash but showed no observable difference performing the gestures on-leash, as long as the leash was long enough to avoid interference. As a result of his experience, we obtained the necessary gesture performance from S2 after two practice sessions. We used pronounced hand signals to illustrate the movement for the spin and twirl gestures. Participant S1 was trained intermittently, for almost two months, until he could perform the gestures from a single verbal cue or minimal hand signals. No food, sensory, or water deprivation was used at any point.

Data Analysis
General approach and challenges
We originally approached our goal as a traditional classification problem. We recorded trainers prompting dogs to perform a given set of gestures, often through luring, for use as training data during supervised learning. Because not every dog could perform the same set of gestures, there were insufficient examples to properly train a statistical classifier. Even before we could use such a classifier, we would require a way to segment gestures from other motions. Training either model required a human labeling each example, which was particularly difficult because there was no universally agreed-upon definition of where a gesture started and ended. In humans, this requirement is met by enforcing a specific definition on the subject performing the gesture ("you must do it like this, otherwise it will not count"); determining the equivalent method for dogs would take much longer. Another issue is that dogs in training tend to maintain eye contact with their trainers, which creates a different head orientation than a dog performing a gesture autonomously. The impact of these issues could be minimized with substantial training, but as described earlier, the availability of our participants and the energy required to train them made it undesirable to spend months training gestures only to find them indistinguishable from everyday movement. To avoid this vicious cycle, we decided to train gestures for which we could do the following three things as needed:
- train all of our previous participants
- verify gesture completion without the sensor
- arbitrarily increase the precision and consistency

We did no explicit labeling other than storing the data containing an example in an individual file, leaving the definition of boundaries undetermined. The only definition we attempted to enforce was that each gesture would be preceded and followed by inertial silence, but we abandoned this rule as well when, despite our best efforts, none of the recorded examples satisfied the condition. Finally, we decided to compare single examples of these gestures against data streams containing everyday movements (so-called Everyday Gesture Libraries) to gauge their viability, as in Ashbrook et al. [2], and also against data sets containing examples of the candidate gestures. In this way, the false positive criterion could be set relative to the distance threshold for true positives; otherwise, an arbitrary, unachievable distance threshold could be set to bring false positives down to zero.
Our hope was that even if these basic gestures proved unsuitable, the reasons for their failure would not prevent them from becoming the building blocks of successful ones. In this way, any time spent training would not be in vain.

Orientation correction
Unlike a human wrist-watch used for gesture detection [2], our inertial motion sensor is attached to the collar, and its position at any given time is not constant. As a result, the readings, which are based on an internal reference frame, might not be the same for similar actions. To decrease collar movement we tried more than five collars (some commercial and some custom-made), tried superimposing two on top of each other, and even resorted to clipping the collar with hair clips, all to no avail. Because canine skin around the neck is inherently loose, and canine bodies are covered in thick hair, it was close to impossible to prevent the sensor from moving. More importantly, even in cases where the sensor has not moved, the angle of the dog's head might vary for the same gesture, for example when maintaining eye contact with the human.

The most effective way to address these issues is to transform the coordinate system of the gyroscope measurements into an external reference frame relative to earth. This new reference frame is based on the direction of gravity sensed at any given time. We denote the output of the three-dimensional accelerometer as a_total = [a_x, a_y, a_z]^T. This output contains two components: a_total = a_linear + a_gravity. Next, we represent the magnetometer readings as m = [m_x, m_y, m_z]^T. The remaining dimension is given by h = a_gravity x m (cross product). We then normalize each of these vectors to obtain unit direction vectors. Finally, we have all the necessary components to create a rotation matrix, which is applied to each individual reading that comes into the system:

R = [ â_gravity  m̂  ĥ ] =
    | â_gx  m̂_x  ĥ_x |
    | â_gy  m̂_y  ĥ_y |
    | â_gz  m̂_z  ĥ_z |
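As an illustration, a minimal sketch of this correction in Python/NumPy is shown below. It assumes a per-sample gravity estimate and magnetometer reading are available as arrays; the function names and the exact way the rotated axes are applied to the gyroscope reading are our own assumptions and may differ from the original implementation.

```python
import numpy as np

def orientation_matrix(a_gravity, m):
    """Build a rotation matrix whose columns are the unit gravity direction,
    the unit magnetometer direction, and their cross product."""
    a_hat = a_gravity / np.linalg.norm(a_gravity)   # unit gravity direction
    m_hat = m / np.linalg.norm(m)                    # unit magnetic direction
    h_hat = np.cross(a_hat, m_hat)                   # third axis from the cross product
    h_hat /= np.linalg.norm(h_hat)
    return np.column_stack((a_hat, m_hat, h_hat))

def correct_gyro(gyro, a_gravity, m):
    """Express a body-frame gyroscope reading along the earth-referenced axes."""
    R = orientation_matrix(a_gravity, m)
    return R.T @ gyro   # components of the reading along each new axis
```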

Obtaining gravity from acceleration
It was crucial to prevent the dog's movement at any given time from influencing the calculation of the gravity vector used for the correction described above. The most common method to separate linear acceleration from gravity is a low-pass filter. In our case this method is also suitable, although some modifications were necessary. For example, when comparing a single gesture (query) against a larger data set (database), each must be corrected for orientation before they are compared. Unfortunately, the acceleration readings in the query segment contain more of the dog's movement than the static gravity readings needed to correct for orientation. For this reason, if the acceleration value at a given point surpassed g_0 (9.8 m/s^2), it was not used in the calculation of gravity (Figure 4):

gravity_i = alpha * gravity_{i-1} + (1 - alpha) * acc_i,   if acc_i <= g_0
gravity_i = gravity_{i-1},                                  if acc_i > g_0        (1)

The smoothing constant was determined to be alpha = 0.9999; acc_i denotes the raw input at sample i and gravity_i denotes the filtered output.

Figure 4. Comparison of a raw acceleration signal of a single gesture and the resulting gravity vector over time.

We used these values to correct the orientation of the gyroscope readings. For verification, we implemented this correction in an on-line system and observed that yaw movements relative to earth appeared on the same axes regardless of the sensor's orientation. The benefit of this correction for recognition depends on how much the collar slides without the dog moving. Even with minimal sliding, the correction allows us to observe a similar signal regardless of whether the head is raised or parallel to the torso.
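A minimal sketch of this filter is shown below, using the constants reported above. Applying the g_0 condition to the magnitude of the raw acceleration sample is our interpretation; the original may test individual axes.

```python
import numpy as np

G0 = 9.8        # gravity threshold g_0 in m/s^2
ALPHA = 0.9999  # smoothing constant alpha

def estimate_gravity(acc, alpha=ALPHA, g0=G0):
    """Per-sample low-pass filter (Equation 1) that holds its previous output
    whenever the acceleration exceeds g0, so vigorous motion does not leak
    into the gravity estimate. `acc` is an (N, 3) array of raw readings."""
    gravity = np.zeros_like(acc)
    gravity[0] = acc[0]                       # initialize with the first sample
    for i in range(1, len(acc)):
        if np.linalg.norm(acc[i]) <= g0:      # quiet sample: update the estimate
            gravity[i] = alpha * gravity[i-1] + (1 - alpha) * acc[i]
        else:                                 # motion-dominated sample: hold
            gravity[i] = gravity[i-1]
    return gravity
```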
Variance of three-axis norm
We used an energy-based approach to segment events of interest from continuous streams. For the event-driven approach used in this experiment, the segmentation criteria are perhaps the most important aspect. The simplest segmentation approach detected a gesture start when a given intensity threshold was met and a gesture stop when the current reading fell below that predetermined value. The main drawback of this approach was that zero crossings fell below every threshold; as a result, signals with zero crossings were interpreted as multiple segments rather than one. Although this behavior can be a benefit in some contexts, it was not suitable for our purposes. We ultimately obtained the best results by observing the variance of the L2 norm, Var(||signal_{i:i+n}||), over a window of n samples (n = 40) for a signal sampled at 51.2 Hz (Figure 5).

Figure 5. Variance norm in green overlaid over one amplified dimension of the raw signal for comparison.

Event segmentation
From an output like the variance-norm vector, a segmentation scheme determined the start and end boundaries of an event. At first, we used a strong threshold that only detected high-energy gestures and lost the initial, lower-intensity portion of each gesture. When this proved insufficient, we used two sets of thresholds on Var(||gyro_i||): one that detected the presence of a gesture (T_detect) and two weaker thresholds (T_start, T_end) to determine the gesture boundaries. This step was crucial, because even a small error in boundary detection dramatically affected the recognition results. We also put constraints on the length of the gesture (number of samples), the intensity of the variance norm and, for spin or twirl gestures, on the angle traveled. The angle was computed as θ = ∫ω dt, where ω is the angular velocity at a given time. Because our start and stop criteria allowed for some noise at the beginning and end of a gesture, the angles rarely summed to 360°. As a result, the only way to distinguish some false positives from left and right reaches was through a threshold level that excluded spin and twirl. For this reason, we had to resort to two detection thresholds, T_detect_reach and T_detect_rot.

Condition   Criteria
Start       T_start > 1,600
Reach       T_detect_reach > 11,000; 60 < samples < 140
Rotation    T_detect_rot > 4,000; 140° < θ < 390°; 45 < samples < 200
End         T_end < 1,500

Table 2. Segmentation criteria determined empirically for gestures sampled at 51.2 Hz. θ denotes the angle of movement; an ideal spin, for example, would be 360°.

Based on the types of gestures we were studying, we used only the y (roll) and z (yaw) readings of the gyroscope, so that the head's upward or downward orientation while performing a gesture was not a concern.
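The sketch below illustrates one way this scheme could be implemented, using the window size and thresholds from Table 2. The ordering of the reach and rotation checks, the state-machine bookkeeping, and the alignment between the variance signal and the raw yaw signal are our own simplifications rather than the authors' exact rules.

```python
import numpy as np

FS = 51.2            # sampling frequency in Hz
WINDOW = 40          # samples per variance window (n = 40)
T_START, T_END = 1600, 1500
T_DETECT_REACH, T_DETECT_ROT = 11000, 4000

def variance_norm(gyro, n=WINDOW):
    """Sliding-window variance of the L2 norm of an (N, 3) gyroscope signal."""
    norms = np.linalg.norm(gyro, axis=1)
    return np.array([np.var(norms[i:i+n]) for i in range(len(norms) - n)])

def segment_events(var_norm, gyro_yaw, dt=1.0 / FS):
    """Find candidate segments with start/detect/end thresholds plus the
    length and angle constraints from Table 2. Indices are approximate:
    the variance signal is shorter than the raw signal by the window length."""
    events, start = [], None
    for i, v in enumerate(var_norm):
        if start is None and v > T_START:
            start = i                                  # gesture may be beginning
        elif start is not None and v < T_END:
            seg = slice(start, i)
            peak = var_norm[seg].max()
            n_samples = i - start
            angle = abs(np.sum(gyro_yaw[seg]) * dt)    # theta = integral of omega dt
            if peak > T_DETECT_REACH and 60 < n_samples < 140:
                events.append((start, i, "reach"))
            elif peak > T_DETECT_ROT and 140 < angle < 390 and 45 < n_samples < 200:
                events.append((start, i, "rotation"))
            start = None                               # reset for the next event
    return events
```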

Distance metric
We employed dynamic time warping (DTW) to compare signals against templates. The resulting distance was divided by the sum of the lengths of the two signals, to account for the fact that longer sequences have more chances to diverge. We tried to impose a locality constraint of w = 10 to avoid pathological warpings; it had no net effect, because comparing signals of different lengths imposes a minimum window of max(|length(a) - length(b)|, w), which always evaluated to |length(a) - length(b)|. We empirically determined dtw.dist = 50 to be the threshold for identifying two segments as examples of the same gesture class. We additionally tried adding several features over time, such as the differences between each pair of points and the displacement up to a given point, but these did not provide any benefit in minimizing distance at this stage.
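A minimal sketch of this length-normalized comparison and the resulting nearest-template assignment is shown below. It omits the locality constraint discussed above, and the helper names and the Euclidean point cost are our own choices.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping with Euclidean point costs,
    normalized by the combined length of the two signals."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i-1] - b[j-1])
            D[i, j] = cost + min(D[i-1, j], D[i, j-1], D[i-1, j-1])
    return D[n, m] / (n + m)   # divide by the sum of the lengths

def classify_segment(segment, templates, threshold=50.0):
    """Assign a segment to the template class with the smallest distance,
    or to the null class (None) if no distance falls below the threshold."""
    best_label, best_dist = None, np.inf
    for label, examples in templates.items():
        for t in examples:
            d = dtw_distance(segment, t)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label if best_dist < threshold else None
```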
Tuning parameters
To verify the event segmentation and the distance metric, and to obtain the parameters listed above, the first step was to collect small data sets of each gesture. These typically contained four gestures at fixed intervals. These data sets were compared against two templates of each individual gesture (Figure 6).

Figure 6. Example of individual gestures in isolation used for comparison against streams of data containing these same gestures.

The expected result was that during a gesture (e.g. right reach), its stored templates resulted in the smallest distance while the other gestures (left reach, spin, twirl) resulted in larger distances (less similar). We evaluated each training data stream against two templates of the same dog performing each of the four gestures (Figure 7a). Our event segmentation criteria then segmented out the area of interest (Figure 7b). Finally, the dynamic time warping distance was calculated for each gesture (Figure 7c). In this way we defined the true positive distance threshold: a segment below dtw.dist < 50 was assigned to the gesture having the smallest distance. In Figure 7c, the first motion was classified as a right reach (red) while the second was classified as a left reach (green).

Figure 7. Example of a right and a left reach detected in a data set containing other movements.

To evaluate both true positives and false positives at once, we collected data over longer periods of time (25-50 minutes) in which the dog would perform everyday movements such as walking, running, playing, lying down and drinking water, but would also perform each gesture at fixed intervals. We refer to these sets as interval EGLs (iEGLs) in the results section.

RESULTS
We performed the same test used for tuning on each of the four remaining untrained sets; that is, a single recognizer was applied to data from both dogs (the threshold was not adjusted for each dog). Two tests were run on the remaining iEGLs for true positives and two on 5-hour EGLs containing everyday movements (Table 3). We also tested S1's data against iEGL streams corresponding to S2; our results showed that the gestures with the lowest distances were the correct ones, but they did not meet the minimum distance threshold (dtw.dist < 50).

DISCUSSION
The results of our experiment were very encouraging. There were no substitutions between any of the four gestures. Similarly, for all the gestures performed as part of the iEGLs there were no deletions. Part of the reason is that while tuning our parameters, extra emphasis was placed on recall, because without it, no comparisons to false positives would be possible. In other words, even though our sample of performed gestures (23) is not sufficient to justify broader conclusions, it is the bare minimum needed to provide a reference for comparing gestures against each other.

There was one case of a false positive motion that we could not eliminate with the criteria described above. The dog turned his head while looking upwards (most likely at the trainer) and our system detected it as a left reach (Figure 8). The difference from a real left reach was that the z axis had significantly less movement, but we could not codify this requirement in a way that achieved rejection. We surmise that a probabilistic classifier would be able to encode this separation given enough training examples.

Dataset  Minutes  Subject  Use       Events  False Pos        FP/hr  Precision  True Pos  Recall  Accuracy
iEGL1    50       S1       Training  48      1 (left)         1      80%        4/4       100%    75%
iEGL2    25       S1       Training  47      0                0      100%       4/4       100%    100%
iEGL3    25       S2       Training  32      0                0      100%       5/5       100%    100%
iEGL4    25       S1       Testing   37      0                0      100%       6/6       100%    100%
iEGL5    25       S2       Testing   6       0                0      100%       4/4       100%    100%
EGL1     305      S1       Testing   50      2 (spin, right)  0.4    -          -         -       -
EGL2     305      S2       Testing   18      0                0      -          -         -       -

Table 3. Summarized results for each data set. Some dogs offered gestures more than four times.

Figure 8. Example of a false positive in the training set. The dog turned his head while looking at the trainer, and the motion met all the conditions for a left reach.

Most of the false positives detected for each gesture occurred due to spontaneous repetition of the requested gesture (for left and right reach). For spin and twirl, we expected some false positives to occur because the dogs did perform an equivalent motion while playing. From this experience we found it useful to make the following distinction.

Types of false positives
We categorize false positives into two types: classificatory and behavioral. The first type covers cases where gesture i looks like gesture j to the identification algorithm. The second type refers to cases where a gesture turns out to be a behavior present during daily living. For example, suppose gesture i is spinning (a 360° clockwise rotation); certain subjects might perform this action spontaneously before lying down. Behavioral false positives cannot be eliminated except by redefining the gesture in a more specific way, such that it can be distinguished from common occurrences. Behavioral false positives can be estimated by the human eye, while classificatory ones depend on the classifier.

Online system
Throughout this offline design process we often implemented an on-line version of our system to verify its operation and incorporate new findings (e.g. orientation correction). The segmentation criteria are implemented with a running sum of squares of the data for computing variance. Our method took 1 to 1.5 seconds from the performance of the gesture until auditory feedback was provided. If, due to this delay, the dogs believed the gesture went undetected, they would repeat it.

Reflections on approach
Time-series recognition requires classification based on multiple samples rather than individual ones. This grouping is often achieved via windowing, continuous hidden Markov models or event segmentation. This last approach is often described as a two-step process consisting of segmentation and classification [7]. Although the approach in the gesture toolkit we drew inspiration from can be interpreted as following this paradigm, its segmentation is meant to optimize run time by ignoring regions with little movement [2]. The MAGIC toolkit, for example, does not rely on segmentation or classification specifically tuned to any one set of gestures; it is general enough to let any designer create new gestures. This aspect was ultimately not suitable for our purposes, because no single activation threshold was sufficient for dynamic time warping to reject every null-class instance. Nonetheless, the process of recording data of everyday movements and performing the gesture design offline was extremely useful.

CONCLUSION AND FUTURE WORK
From our results, we have been able to understand the constraints and requirements of minimizing false positives in always-on gestures.
We also solved some practical problems in this area. For example, we illustrated a method for correcting orientation on a dog collar, even for very short segments. Finally, we found four gestures that could be concretely defined, trained, and recognized, in addition to developing a way to train them. These gestures were recognized with 75-100% accuracy, and their false positive rate averaged less than one per hour. Our next step will involve collecting more examples using our concrete definition (embodied in the two-target vest) to train a probabilistic classifier. Ideally, we want a completely subject-independent system capable of running in an online fashion. For the design process, incorporating indexed video to accompany the inertial readings would allow us to observe which particular motion caused a false positive and would accelerate the process significantly.

ACKNOWLEDGMENTS
This work was funded by the National Science Foundation under Grant IIS-1320690. We also received a seed grant from the Georgia Tech GVU Center and the Wearable Computing Center. We would like to express our appreciation to the reviewers of our manuscript for their time and effort.

REFERENCES
1. Michael Ben Alexander, Ted Friend, and Lore Haug. 2011. Obedience training effects on search dog performance. Applied Animal Behaviour Science. Elsevier, 152-159.
2. Daniel Ashbrook and Thad Starner. 2010. MAGIC: a motion gesture design tool. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, CA, 2159-2168.
3. Mariana Bentosela, Gabriela Barrera, Adriana Jakovcevic, Angel M Elgier, and Alba E Mustaca. 2008. Effect of reinforcement, reinforcer omission and extinction on a communicative response in domestic dogs (Canis familiaris). Behavioural Processes. Elsevier, Netherlands, 464-469.
4. Rita Brugarolas, David Roberts, Barbara Sherman, and Alper Bozkurt. 2012. Posture estimation for a canine machine interface based training system. In Engineering in Medicine and Biology Society. IEEE, 1.
5. Rita Brugarolas, David Roberts, Barbara Sherman, and Alper Bozkurt. 2013. Machine learning based posture estimation for a wireless canine machine interface. In Biomedical Wireless Technologies, Networks, and Sensing Systems (BioWireleSS), IEEE Topical Conference on. IEEE, New Jersey, 10-12.
6. Susan Bulanda. 2010. Ready! The Training of the Search and Rescue Dog, 2nd Edition. Kennel Club Books, NY.
7. Andreas Bulling, Ulf Blanke, and Bernt Schiele. 2014. A tutorial on human activity recognition using body-worn inertial sensors. ACM Computing Surveys. ACM, 33.
8. Kenneth G Furton and Lawrence J Myers. 2001. The scientific foundation and efficacy of the use of canines as chemical detectors for explosives. Talanta (2001).
9. Linda Gerencsér, Gábor Vásárhelyi, Máté Nagy, Tamás Vicsek, and Ádám Miklósi. 2013. Identification of behaviour in freely moving dogs using inertial sensors. PLoS ONE 8, 10 (2013), 77814.
10. EF Hiby, NJ Rooney, and JWS Bradshaw. 2004. Dog training methods: their use, effectiveness and interaction with behaviour and welfare. Animal Welfare. Universities Federation for Animal Welfare, 63-70.
11. Shimmer Inc. 2014. Wearable Sensing Technology. http://www.shimmersensing.com/ Version 3.
12. Melody M Jackson, Giancarlo Valentin, Larry Freil, Lily Burkeen, Clint Zeagler, Scott Gilliland, Barbara Currier, and Thad Starner. 2015. FIDO Facilitating interactions for dogs with occupations: wearable communication interfaces for working dogs. Personal and Ubiquitous Computing. Springer-Verlag, 155-173.
13. Cassim Ladha, Nils Hammerla, Emma Hughes, Patrick Olivier, and Thomas Plötz. 2013. Dog's life: wearable activity recognition for dogs. In Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing. ACM, Zurich, 415-418.
14. Paul E Miller and Christopher J Murphy. 1995. Vision in dogs. Journal of the American Veterinary Medical Association, Vol. 207. AVMA, 1623-1634.
15. Jay Neitz, Timothy Geist, and Gerald H Jacobs. 1989. Color vision in the dog. Visual Neuroscience, Vol. 3. Cambridge University Press.
16. Charles Sanders Peirce. 1974. Collected Papers of Charles Sanders Peirce, Vol. 5. Harvard University Press, NY.
17. Clarence J Pfaffenberger, JP Scott, JL Fuller, BE Ginsburg, and SW Biefelt. 1976. Guide Dogs for the Blind: Their Selection, Development, and Training. Elsevier Scientific Publishing Company, NY.
18. Karen Pryor. 2009. Reaching the Animal Mind: Clicker Training and What It Teaches Us About All Animals. Simon and Schuster, NY.
19. Cristina Ribeiro, Alexander Ferworn, Mieso Denko, and James Tran. 2009. Canine pose estimation: a computing for public safety solution. In Computer and Robot Vision, Canadian Conference on. IEEE, NJ, 37-44.
20. Cristina Ribeiro, Alexander Ferworn, Mieso Denko, James Tran, and Chris Mawson. 2008. Wireless estimation of canine pose for search and rescue. In System of Systems Engineering, 2008 (SoSE '08), IEEE International Conference on. IEEE, NJ, 1-6.
21. Alexandre Pongrácz Rossi and César Ades. 2008. A dog at the keyboard: using arbitrary signs to communicate requests. Animal Cognition (2008), 329-338.
22. Giancarlo Valentin. 2014. Gestural activity recognition for canine-human communication. In Proceedings of the ACM International Symposium on Wearable Computers: Adjunct Program. ACM, 145-149.
23. Giancarlo Valentin, Joelle Alcaidinho, Ayanna Howard, Melody M Jackson, and Thad Starner. 2015b. Towards a canine-human communication system based on head gestures. In 12th International Conference on Advances in Computer Entertainment Technology, Vol. 1. ACM, California, 1-6.
24. Giancarlo Valentin, Joelle Alcaidinho, and Melody Moore Jackson. 2015a. The challenges of wearable computing for working dogs. In Proceedings of the ACM International Symposium on Wearable Computers. ACM, 1279-1284.
25. Kyoko Yonezawa, Takashi Miyaki, and Jun Rekimoto. 2009. Cat@Log: sensing device attachable to pet cats for supporting human-pet interaction. In Proceedings of the International Conference on Advances in Computer Entertainment Technology. ACM, California, 149-156.