
Electronic Acknowledgement Receipt

EFS ID: 6577791

Application Number: 12631714

Confirmation Number: 2361

International Application Number:

Title of Invention: SYSTEM FOR DETECTION OF BODY MOTION

First Named Inventor/Applicant Name: Ruzena Bajcsy

Customer Number: 37490

Application Type: Utility under 35 USC 111(a)

Time Stamp: 19:39:59

Filing Date:

Receipt Date: 04-DEC-2009

Attorney Docket Number: 010030-002710US

Filer Authorized By: Charles J. Kulas

Filer: Charles J. Kulas/Megan Godsey

Payment information:
Submitted with Payment: no

File Listing:

Document Number | Document Description | File Name | File Size (Bytes) / Message Digest | Multi Part /.zip | Pages (if appl.)

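Document Number: 1
Document Description: Application Data Sheet
File Name: 010030-002710US-B08-082-2-ADS.pdf
File Size (Bytes): 1185052
Message Digest: ab707445f83afef538b39db5d6d1697966f3e7ea
Multi Part /.zip: no
Pages (if appl.): 5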

Warnings:

Information:

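Document Number: 2
Document Description: Drawings-only black and white line drawings
File Name: 010030-002710US-B08-082-2-Figures.pdf
File Size (Bytes): 1324283
Message Digest: 230ab0f9ee1c7c9b9807d6a9a0d89e666464296b
Multi Part /.zip: no
Pages (if appl.): 15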

Warnings:

Information:

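Document Number: 3
Document Description: Information Disclosure Statement (IDS) Filed (SB/08)
File Name: 010030-002710US-B08-082-2-IDS.pdf
File Size (Bytes): 777024
Message Digest: cb363f71b2ddb0e4c6450e87c1ec40a84ce2b79b
Multi Part /.zip: no
Pages (if appl.): 4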

Warnings:

Information:

A U.S. Patent Number Citation or a U.S. Publication Number Citation is required in the Information Disclosure Statement (IDS) form for autoloading of data into USPTO systems. You may remove the form to add the required data in order to correct the Informational Message if you are citing U.S. References. If you chose not to include U.S. References, the image of the form will be processed and be made available within the Image File Wrapper (IFW) system. However, no data will be extracted from this form. Any additional data such as Foreign Patent Documents or Non Patent Literature will be manually reviewed and keyed into USPTO systems.

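Document Number: 4
Document Description: NPL Documents
File Name: 010030-002710US-Reference1-CVPR4HB-AllenYang.pdf
File Size (Bytes): 616677
Message Digest: c5f378852f60f98a3ba3fc46ba564393e48fed20
Multi Part /.zip: no
Pages (if appl.): 8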

Warnings:

Information:

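Document Number: 5
Document Description: NPL Documents
File Name: 010030-002710US-Reference2-JAISE08-AllenYang.pdf
File Size (Bytes): 435447
Message Digest: a32457a386377ea074f444c1abe62baf22c0da56
Multi Part /.zip: no
Pages (if appl.): 13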

Warnings:

Information:

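Document Number: 6
Document Description: NPL Documents
File Name: 010030-002710US-Reference3-EECS-2007-143.pdf
File Size (Bytes): 1220691
Message Digest: 85cebc28099ff5e89f36e84b1f5091ba21050a19
Multi Part /.zip: no
Pages (if appl.): 19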

Warnings:

Information:

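Document Number: 7
Document Description: NPL Documents
File Name: 010030-002710US-Reference4-spots09-paper6.pdf
File Size (Bytes): 2340714
Message Digest: 7afd37e789b4c26477ec6aec5dd79d7857b83a9d
Multi Part /.zip: no
Pages (if appl.): 11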

Warnings:

Information:

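Document Number: 8
Document Description:
File Name: 010030-002710US-B08-082-2-SystemforDetectionofBodyMotion-v6.pdf
File Size (Bytes): 198715
Message Digest: e6475477f29832165778b2ee1913d976948a0ce6
Multi Part /.zip: yes
Pages (if appl.): 36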

Multipart Description/PDF files in .zip description

Document Description Start End

Specification 1 33

Claims 34 35

Abstract 36 36

Warnings:

Information:

Total Files Size (in bytes): 8098603

This Acknowledgement Receipt evidences receipt on the noted date by the USPTO of the indicated documents, characterized by the applicant, and including page counts, where applicable. It serves as evidence of receipt similar to a Post Card, as described in MPEP 503.

New Applications Under 35 U.S.C. 111

If a new application is being filed and the application includes the necessary components for a filing date (see 37 CFR 1.53(b)-(d) and MPEP 506), a Filing Receipt (37 CFR 1.54) will be issued in due course and the date shown on this Acknowledgement Receipt will establish the filing date of the application.

National Stage of an International Application under 35 U.S.C. 371

If a timely submission to enter the national stage of an international application is compliant with the conditions of 35 U.S.C. 371 and other applicable requirements a Form PCT/DO/EO/903 indicating acceptance of the application as a national stage submission under 35 U.S.C. 371 will be issued in addition to the Filing Receipt, in due course.

New International Application Filed with the USPTO as a Receiving Office

If a new international application is being filed and the international application includes the necessary components for an international filing date (see PCT Article 11 and MPEP 1810), a Notification of the International Application Number and of the International Filing Date (Form PCT/RO/105) will be issued in due course, subject to prescriptions concerning national security, and the date shown on this Acknowledgement Receipt will establish the international filing date of the application.

EFS Web 2.2.2

PTO/SB/14 (07-07)

Approved for use through 06/30/2010. OMB 0651-0032

U.S. Patent and Trademark Office; U.S. DEPARTMENT OF COMMERCE

Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it contains a valid OMB control number.

Application Data Sheet 37 CFR 1.76

Attorney Docket Number: 010030-002710US
Application Number:
Title of Invention: SYSTEM FOR DETECTION OF BODY MOTION

The application data sheet is part of the provisional or nonprovisional application for which it is being submitted. The following form contains the bibliographic data arranged in a format specified by the United States Patent and Trademark Office as outlined in 37 CFR 1.76.

This document may be completed electronically and submitted to the Office in electronic format using the Electronic Filing System (EFS) or the document may be printed and included in a paper filed application.

Secrecy Order 37 CFR 5.2

Portions or all of the application associated with this Application Data Sheet may fall under a Secrecy Order pursuant to 37 CFR 5.2 (Paper filers only. Applications that fall under Secrecy Order may not be filed electronically.)

Applicant Information:

Applicant Authority: Inventor / Legal Representative under 35 U.S.C. 117 / Party of Interest under 35 U.S.C. 118
Residence Information (Select One): US Residency / Non US Residency / Active US Military Service

Applicant 1
Name (Prefix, Given Name, Middle Name, Family Name, Suffix): Ruzena Bajcsy
City: Berkeley
State/Province: CA
Country of Residence: US
Citizenship under 37 CFR 1.41(b): US
Mailing Address of Applicant:
Address 1: 665 Soda Hall
Address 2:
City: Berkeley
State/Province: CA
Postal Code: 94720
Country: US

Applicant 2
Name (Prefix, Given Name, Middle Name, Family Name, Suffix): Allen Y. Yang
City: Berkeley
State/Province: CA
Country of Residence: US
Citizenship under 37 CFR 1.41(b): CN
Mailing Address of Applicant:
Address 1: 307 Cory Hall
Address 2:
City: Berkeley
State/Province: CA
Postal Code: 94720
Country: US

Applicant 3
Name (Prefix, Given Name, Middle Name, Family Name, Suffix): S. Shankar Sastry
City: Berkeley
State/Province: CA
Country of Residence: US
Citizenship under 37 CFR 1.41(b): US
Mailing Address of Applicant:
Address 1: 514 Cory Hall
Address 2:
City: Berkeley
State/Province: CA
Postal Code: 94720
Country: US

Applicant 4
Name (Prefix, Given Name, Middle Name, Family Name, Suffix): Roozbeh Jafari
City: Richardson
State/Province: TX
Country of Residence: US
Citizenship under 37 CFR 1.41(b): IR
Mailing Address of Applicant:
Address 1: 800 W. Campbell Road, EC33
Address 2:
City: Richardson
State/Province: TX
Postal Code: 75080
Country: US

All Inventors Must Be Listed - Additional Inventor Information blocks may be generated within this form by selecting the Add button.

Correspondence Information:

Enter either Customer Number or complete the Correspondence Information section below. For further information see 37 CFR 1.33(a).

An Address is being provided for the correspondence Information of this application.
Customer Number: 37490
Email Address:

Application Information:

Title of the Invention: SYSTEM FOR DETECTION OF BODY MOTION
Attorney Docket Number: 010030-002710US
Small Entity Status Claimed:
Application Type: Nonprovisional
Subject Matter: Utility
Suggested Class (if any):
Sub Class (if any):
Suggested Technology Center (if any):
Total Number of Drawing Sheets (if any): 15
Suggested Figure for Publication (if any): 19

Publication Information:

Request Early Publication (Fee required at time of Request 37 CFR 1.219)

Request Not to Publish. I hereby request that the attached application not be published under 35 U.S.C. 122(b) and certify that the invention disclosed in the attached application has not and will not be the subject of an application filed in another country, or under a multilateral international agreement, that requires publication at eighteen months after filing.

Representative Information:

Representative information should be provided for all practitioners having a power of attorney in the application. Providing this information in the Application Data Sheet does not constitute a power of attorney in the application (see 37 CFR 1.32). Enter either Customer Number or complete the Representative Name section below. If both sections are completed the Customer Number will be used for the Representative Information during processing.

Please Select One: Customer Number / US Patent Practitioner / Limited Recognition (37 CFR 11.9)
Customer Number: 37490

Domestic Benefit/National Stage Information:

This section allows for the applicant to either claim benefit under 35 U.S.C. 119(e), 120, 121, or 365(c) or indicate National Stage entry from a PCT application. Providing this information in the application data sheet constitutes the specific reference required by 35 U.S.C. 119(e) or 120, and 37 CFR 1.78(a)(2) or CFR 1.78(a)(4), and need not otherwise be made part of the specification.

Prior Application Status: Pending
Application Number:
Continuity Type: non provisional of
Prior Application Number: 61119861
Filing Date (YYYY-MM-DD): 2008-12-04

Additional Domestic Benefit/National Stage Data may be generated within this form by selecting the Add button.

Foreign Priority Information:

This section allows for the applicant to claim benefit of foreign priority and to identify any prior foreign application for which priority is not claimed. Providing this information in the application data sheet constitutes the claim for priority as required by 35 U.S.C. 119(b) and 37 CFR 1.55(a).

Application Number:
Country:
Parent Filing Date (YYYY-MM-DD):
Priority Claimed: Yes / No

Additional Foreign Priority Data may be generated within this form by selecting the Add button.

Assignee Information:

Providing this information in the application data sheet does not substitute for compliance with any requirement of part 3 of Title 37 of the CFR to have an assignment recorded in the Office.

Assignee 1
If the Assignee is an Organization check here.
Organization Name: The Regents of the University of California
Mailing Address Information:
Address 1: 1111 Franklin Street, 12th Floor
Address 2:
City: Oakland
State/Province: CA
Country: US
Postal Code: 94607-5200
Phone Number:
Fax Number:
Email Address:

Additional Assignee Data may be generated within this form by selecting the Add button.

Signature:

A signature of the applicant or representative is required in accordance with 37 CFR 1.33 and 10.18. Please see 37 CFR 1.4(d) for the form of the signature.

Signature: /Charles J. Kulas/
Date (YYYY-MM-DD): 2009-12-03
First Name: Charles
Last Name: Kulas
Registration Number: 35809

This collection of information is required by 37 CFR 1.76. The information is required to obtain or retain a benefit by the public which is to file (and by the USPTO to process) an application. Confidentiality is governed by 35 U.S.C. 122 and 37 CFR 1.14. This collection is estimated to take 23 minutes to complete, including gathering, preparing, and submitting the completed application data sheet form to the USPTO. Time will vary depending upon the individual case. Any comments on the amount of time you require to complete this form and/or suggestions for reducing this burden, should be sent to the Chief Information Officer, U.S. Patent and Trademark Office, U.S. Department of Commerce, P.O. Box 1450, Alexandria, VA 22313-1450. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450.

Privacy Act Statement


The Privacy Act of 1974 (P.L. 93-579) requires that you be given certain information in connection with your submission of the attached form related to

a patent application or patent. Accordingly, pursuant to the requirements of the Act, please be advised that: (1) the general authority for the collection

of this information is 35 U.S.C. 2(b)(2); (2) furnishing of the information solicited is voluntary; and (3) the principal purpose for which the information is

used by the U.S. Patent and Trademark Office is to process and/or examine your submission related to a patent application or patent. If you do not

furnish the requested information, the U.S. Patent and Trademark Office may not be able to process and/or examine your submission, which may

result in termination of proceedings or abandonment of the application or expiration of the patent.

The information provided by you in this form will be subject to the following routine uses:

1. The information on this form will be treated confidentially to the extent allowed under the Freedom of Information Act (5 U.S.C. 552)

and the Privacy Act (5 U.S.C. 552a). Records from this system of records may be disclosed to the Department of Justice to determine

whether the Freedom of Information Act requires disclosure of these records.

2. A record from this system of records may be disclosed, as a routine use, in the course of presenting evidence to a court, magistrate, or

administrative tribunal, including disclosures to opposing counsel in the course of settlement negotiations.

3. A record in this system of records may be disclosed, as a routine use, to a Member of Congress submitting a request involving an

individual, to whom the record pertains, when the individual has requested assistance from the Member with respect to the subject matter of

the record.

4. A record in this system of records may be disclosed, as a routine use, to a contractor of the Agency having need for the information in

order to perform a contract. Recipients of information shall be required to comply with the requirements of the Privacy Act of 1974, as

amended, pursuant to 5 U.S.C. 552a(m).

5. A record related to an International Application filed under the Patent Cooperation Treaty in this system of records may be disclosed,

as a routine use, to the International Bureau of the World Intellectual Property Organization, pursuant to the Patent Cooperation Treaty.

6. A record in this system of records may be disclosed, as a routine use, to another federal agency for purposes of National Security

review (35 U.S.C. 181) and for review pursuant to the Atomic Energy Act (42 U.S.C. 218(c)).

7. A record from this system of records may be disclosed, as a routine use, to the Administrator, General Services, or his/her designee,

during an inspection of records conducted by GSA as part of that agency's responsibility to recommend improvements in records

management practices and programs, under authority of 44 U.S.C. 2904 and 2906. Such disclosure shall be made in accordance with the

GSA regulations governing inspection of records for this purpose, and any other relevant (i.e., GSA or Commerce) directive. Such

disclosure shall not be used to make determinations about individuals.

8. A record from this system of records may be disclosed, as a routine use, to the public after either publication of the application pursuant

to 35 U.S.C. 122(b) or issuance of a patent pursuant to 35 U.S.C. 151. Further, a record may be disclosed, subject to the limitations of 37

CFR 1.14, as a routine use, to the public if the record was filed in an application which became abandoned or in which the proceedings were

terminated and which application is referenced by either a published application, an application open to public inspections or an issued

patent.

9. A record from this system of records may be disclosed, as a routine use, to a Federal, State, or local law enforcement agency, if the

USPTO becomes aware of a violation or potential violation of law or regulation.

Attorney Docket No.: 010030-002710US

Client Reference No.: B08-082-2

PATENT APPLICATION


INVENTORS:

Ruzena Bajcsy, a citizen of the USA, residing at: 665 Soda Hall, Berkeley, CA 94720

Allen Y. Yang, a citizen of China, residing at: 307 Cory Hall, Berkeley, CA 94720

S. Shankar Sastry, a citizen of the USA, residing at: 514 Cory Hall, Berkeley, CA 94720

Roozbeh Jafari, a citizen of Iran, residing at: 800 W. Campbell Rd., EC33, Richardson, TX 75080

Please direct communications to:

Trellis Intellectual Property Law Group, PC

1900 Embarcadero Rd. Suite 109

Palo Alto, CA 94303

Phone: 650-842-0300

ASSIGNEE: The Regents of the University of California

ENTITY: Small





Acknowledgement of Government Support

[01] This invention was made with Government support under Office of US Army

Research Laboratory Grant No. MURIW911NF06-1-0076. The Government may have

certain rights to this invention.

Claim of Priority

[02] This application claims priority from U.S. Provisional Patent Application Serial

No. 61/119,861, entitled SYSTEM FOR DETECTION OF BODY MOTION, filed on

December 4, 2008, which is hereby incorporated by reference as if set forth in full in this

application for all purposes.

Copyright Disclaimer

[03] A portion of the disclosure recited in the specification contains material which

may be subject to copyright protection. Specifically, a functional language such as

computer source code, pseudo-code or similar executable or design language may be

provided. The copyright owner has no objection to the facsimile reproduction of the

specification as filed in the Patent and Trademark Office. Otherwise all copyright rights

are reserved.




Background

[04] Motion sensors and specialized processing can be used to measure and classify

the actions of persons or objects. For example, multiple sensors can be placed at body

locations such as wrists, ankles, midsection, etc. By analyzing the motion measured by

each sensor the subject’s overall body movement or action can be determined.

[05] Some sensor-based action recognition approaches utilize a single sensor while

others use multiple sensors mounted in different locations to improve the accuracy of

overall action recognition. Action recognition systems typically include feature extraction

and classification processing that can be either distributed or centralized. However,

conventional approaches may not have sufficient accuracy in recognizing the actions of a

body or object for many modern applications.

[06] Human action detection is useful in many applications such as medical-care

monitoring, athlete training, tele-immersion, human-computer interaction, virtual reality,

motion capture, etc. In some applications, such as medical-care monitoring that takes

place in a user’s home, it may be desirable to maintain a low-cost system with a minimal

number of sensors, and to reduce resource use such as processing power, bandwidth, cost,

etc., while still maintaining desired accuracy and performance.

Brief Description of the Drawings

Figure 1 illustrates an example wireless body sensor arrangement.

Figure 2 illustrates an example motion sensor system.

Figure 3 illustrates an example of action duration variations.




Figure 4 illustrates example waveforms of accelerometer and gyroscope readings for two

repetitive stand-kneel-stand actions.

Figure 5 illustrates an example one-dimensional manifold model using a two-dimensional

subspace.

Figure 6 illustrates an example sparse ℓ1 solution for a stand-to-sit action on a waist

sensor with corresponding residuals.

Figure 7 illustrates example waveforms for a multiple segmentation hypothesis on a wrist

sensor node.

Figure 8 illustrates an example invalid representation waveform.

Figure 9 illustrates a flow diagram of an example method of determining a motion using

distributed sensors.

Figure 10 illustrates example waveforms for an x-axis accelerometer reading for a stand-

sit-stand action.

Figure 11 illustrates example waveforms for an x-axis accelerometer reading for a sit-lie-

sit action.

Figure 12 illustrates example waveforms for an x-axis accelerometer reading for a bend

down action.

Figure 13 illustrates example waveforms for an x-axis accelerometer reading for a kneel-

stand-kneel action.

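Figure 14 illustrates example waveforms for an x-axis accelerometer reading for a turn clockwise then counter action.

Figure 15 illustrates example waveforms for an x-axis accelerometer reading for a turn clockwise 360° action.

Figure 16 illustrates example waveforms for an x-axis accelerometer reading for a turn counter clockwise 360° action.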

Figure 17 illustrates example waveforms for an x-axis accelerometer reading for a jump

action.

Figure 18 illustrates example waveforms for an x-axis accelerometer reading for a go

upstairs action.

Figure 19 illustrates basic components and subsystems in a basic description of a system

suitable for practicing the invention.

Detailed Description of Embodiments of the Invention

[07] In particular embodiments, body motions are determined by using one or more

distributed sensors or sensor nodes. Although a preferred embodiment of the invention

uses accelerometer and global positioning system (GPS) position sensors, features

described herein may be adaptable for use with any other suitable types of sensors or

position sensing systems such as triangulation or point-of-reference sensors (e.g.,

infrared, ultrasound, radio-frequency, etc.), imaging (e.g., video image recognition),

mechanical sensing (e.g., joint extension, shaft encoders, linear displacement, etc.),

magnetometers or any other type of suitable sensors or sensing apparatus. Sensors may be

included with other functionality such as in a cell phone or other electronic device. In

general, various modifications, substitutions or other variations from the particular

embodiments described herein will be acceptable and are included within the scope of the

claims.

[08] In one embodiment of the present invention, a body action recognition system functions on a single cell phone device, such as Apple iPhone®, GOOGLE gPhone™, Nokia N800/900™, etc. Other mobile devices that include motion sensors can also be employed, such as PDAs, personal navigation systems, etc. While some functionality is possible with standard cell phones, smart phones typically include integrated motion sensors, a processor and memory. These components can be sufficient to support an implementation of the body action recognition algorithm. Particular embodiments may be able to employ whatever capacity a particular cell phone device happens to be equipped with to make an evaluation of the owner's physical movements. For example, ranging location can be provided by cell towers, WiFi locations, GPS sensors, etc. Magnetometers providing magnetic North indication can be employed by the present invention, as well as a gyroscope and other position sensors. Wireless links which provide useful input to the present invention can include Bluetooth, ZigBee™, etc.

[09] The cell phone can be placed at a specified body position such as around the

waist or the neck. The software reads motion sensor data directly from the onboard

sensor, and executes a classification algorithm via the processor. Information on the

statistics or nature of the action classification can be visually or audibly presented on the

device to indicate to the user activity level, warnings of dangerous situations (e.g.,

unsteady gait), health status or other information that can be derived from the action

recognition.

[10] In addition, a wireless connection between the cell phone and a base station computer can be established to transmit and store classification results. When potentially harmful human actions are detected (e.g., a fall), alert information (e.g., nature of alert, location, verbal requests) can be transmitted to the monitoring station, which can subsequently forward it to emergency responders, preferred personal contacts, health care personnel or other preferred contacts. During an alert event, continuous sensor data can be transmitted to a base station that has a more powerful processing capability (more powerful processor and access to a larger human actions database) to validate the human action classification and reduce false alerts.
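The alert path described in paragraph [10] can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `HARMFUL_ACTIONS`, `Alert`, `handle_classification` and the `validate` callback are hypothetical and do not appear in the specification, and the base-station check is left as a caller-supplied function because the specification does not say how validation is performed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Hypothetical set of action labels treated as potentially harmful.
HARMFUL_ACTIONS = {"fall"}


@dataclass
class Alert:
    nature: str            # nature of the alert, e.g. "fall"
    location: str          # location reported by the device
    samples: List[float]   # sensor data forwarded with the alert


def handle_classification(
    label: str,
    location: str,
    samples: Sequence[float],
    validate: Callable[[Sequence[float]], bool],
) -> Optional[Alert]:
    """Return an Alert only if the device-side label is harmful and the
    base station (represented by `validate`) confirms it."""
    if label not in HARMFUL_ACTIONS:
        return None  # ordinary result: nothing to escalate
    if not validate(samples):
        return None  # base station disagreed: treated as a false alert
    return Alert(nature=label, location=location, samples=list(samples))
```

In this sketch the second check stands in for the more powerful base-station processing; an alert is forwarded only when both levels agree.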

[11] In one embodiment, sensors are worn on a user’s body. Each sensor’s data is

relayed to a local, in-home, base station and then to a remote managing facility.

Preliminary processing can be performed at one or more points in the transfer or relay of

sensor data. For example, sensor data can be subjected to sensor-level processing by

associating one or more of the sensors with resources such as a digital processor,

memory, etc. In one approach, one or more sensors are included in a sensor node

assembly (which, when highly miniaturized, may be referred to as a “mote”) that can include

processing resources and data communication resources such as a wireless transceiver. A

body-level controller that includes functions such as a wireless transceiver, cell phone,

personal digital assistant (PDA), GPS unit or other customized unit or device worn on the

body can also perform preliminary processing in addition to, or in place of, the sensor-

level processing. A local base station can receive the sensor data and perform additional

processing. The local sensor data is transferred to the remote managing facility where

further processing and analysis can be performed to make a final determination of body

motion or actions based on the sensor data.

[12] In one embodiment, the preliminary processing at the sensor, body or local levels acts to analyze the data and filter out data that is not deemed to be of substantial significance in the ultimate determination of a body motion or action. Other actions can be performed by the preliminary processing such as optimizing, calibrating (e.g., normalizing) or otherwise adjusting the raw sensor data in order to aid in efficient motion analysis. Sensor data may be combined or transferred at the sensor, body, local or other lower levels in order to facilitate analysis.
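As a rough illustration of the preliminary processing in paragraph [12], the sketch below normalizes one window of readings from a single sensor axis and discards windows that show almost no motion. The function name `preprocess`, the threshold `min_std` and the choice of standard deviation as the significance measure are assumptions made for this example; the specification does not prescribe a particular filter or normalization.

```python
import numpy as np


def preprocess(window, min_std=0.05):
    """Normalize one window of readings from a single sensor axis.

    Returns None when the window is nearly constant, i.e. it is filtered
    out as carrying too little motion to be worth relaying.
    """
    x = np.asarray(window, dtype=float)
    std = x.std()
    if std < min_std:
        return None  # filtered: not deemed significant
    return (x - x.mean()) / std  # zero mean, unit variance
```

A window that passes the filter is returned with zero mean and unit variance, which is one common way to calibrate raw readings before feature extraction.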

[13] In a preferred embodiment, feature extraction and classification functions can be performed at various levels (e.g., local or global) in the system in order to reduce communication bandwidth requirements and sensor node power consumption. In the preferred embodiment, a common classification approach can be used at the local and global levels, simplifying system design while also improving classification accuracy. By modeling the distribution of multiple classes as a mixture subspace model, i.e., one subspace for each action class, we seek the sparsest linear representation of the test sample with respect to all the training examples. In this model, the dominant coefficients in the sparse representation correspond to the class of the test sample. If the action cannot be classified locally, the action can be transmitted to a global classifier that uses the same structure but incorporates additional sensor samples to improve classification accuracy. This method is scalable, i.e., multiple classification levels can be used, and robust in that the processing structure does not change when sensors are added or removed, while at the same time it minimizes communication requirements.
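The classification idea in paragraph [13] can be illustrated with a minimal sketch. It assumes the training examples are stacked as columns of a matrix `A` with one class label per column, and it approximates the sparse representation with iterative soft-thresholding (ISTA) on an ℓ1-penalized least-squares problem. ISTA is a solver chosen here for brevity, not one named in the specification, and the function names `ista_l1` and `classify` and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np


def ista_l1(A, y, lam=0.01, n_iter=2000):
    """Approximately minimize 0.5*||A x - y||^2 + lam*||x||_1 using
    iterative soft-thresholding (a stand-in l1 solver)."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L                            # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)    # shrinkage
    return x


def classify(A, labels, y, lam=0.01):
    """Return (label, x): the class whose training columns best
    reconstruct y from the sparse code x, and the code itself."""
    labels = np.asarray(labels)
    x = ista_l1(A, y, lam)
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)  # keep class-c coefficients only
        residuals[c] = np.linalg.norm(y - A @ xc)
    best = min(residuals, key=residuals.get)
    return best, x
```

Because the dominant coefficients of `x` concentrate on the training columns of one class, the class with the smallest reconstruction residual is reported. The same routine could in principle be run locally on one sensor's features or globally on features stacked from several sensors.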

[14] Figure 19 illustrates basic parts of a system for performing body motion

identification according to embodiments of the invention. In Figure 19, system 1900

includes body sensors 1920 worn by user 1910. In this example, there are five sensors

located on the right and left wrists as 1920-1 and 1920-2, respectively; right and left

ankles as 1920-3 and 1920-4, respectively, and midsection 1920-5. The sensors are

coupled to a central body controller 1930. Data relay between the sensors and central

body controller can be by wired or wireless communication. If desired, one or more

sensors can be coupled to additional sensor resources such as by using node assemblies to

provide data processing or other functions. Similarly, data transfers between the central

body controller and the local base station can be by wireless communications such as

Bluetooth, Zigbee, Wi-Fi, or the like. Note that, in general, any suitable type of

communication link may be used among any one or more components in the system.

[15] Central controller 1930 relays the information to an access point or local base station 1940 that is located in or near the house or other enclosure 1950. A data link between the local base station and managing entity 1970 is provided via Internet 1960. Managing entity 1970 uses the received sensor data and database 1980 in order to make a determination of an action performed by user 1910. Managing entity 1970 can also send data to the local subsystems, such as the local base station, central body controller, sensors and motes discussed above, in order to control, interrogate or perform other functions with the subsystems.

[16] Managing entity 1970 can interface with many different local subsystems such

as outdoor user 1990 having a similar set of sensors as those described above for user

1910. Outdoor user 1990’s sensor data is provided to an outdoor base station 1992.

Outdoor base station can be, for example, a cellular network site, satellite in a satellite

telephone network, radio-frequency communication, etc. Many such indoor and outdoor

users can be managed by a single managing entity as illustrated by multiple users at 1994.

Although specific numbers and types of components are illustrated, it should be apparent

that many variations in type and number are possible to achieve the functionality

described herein.

[17] In one embodiment, managing entity 1970 can provide communications to user

1910 via a display and user input device (a “user interface”). For example, the central

body controller can be a cell phone having a display, pointing device, touch screen,

numeric keypad, QWERTY keyboard, etc. Other devices can be used to provide a user

interface such as a desktop personal computer, laptop, sub-notebook, ultra-portable

computing device, etc. (not shown).

[18] As described herein (i.e., in this specification and in the included documents), preliminary processing may be performed at any of the local subsystems or at other locations prior to the data reaching managing entity 1970 in order to reduce the amount of data that is relayed downstream. For example, data filtering (i.e., discarding) can occur at low levels in the system such as at the sensor, sensor node, central body controller or other local level of operation. In a preferred embodiment, the computational approach described herein allows functions such as feature extraction and classification to be used to identify false data and “outlier” data, where false data is data that is not desirable for action classification and outlier data is data that results in a motion classification that is not a motion of interest to the system. For example, if the only classifications that are desired to be detected are standing, sitting and walking, then data that results in a determination of a running motion is outlier data. The false data and outlier data can be prevented from further propagation and use in the system by performing local preliminary processing. This improves the efficiency of data processing and can reduce data communication bandwidth requirements and power consumption.
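A minimal sketch of the local filtering in paragraph [18] follows. The function `filter_local` and its parameters are hypothetical: `is_valid` stands in for whatever test identifies false data, and `classify` for the local classifier, neither of which is specified here.

```python
from typing import Callable, Iterable, List, Set, Tuple, TypeVar

W = TypeVar("W")  # one window of sensor data, in whatever form is used


def filter_local(
    windows: Iterable[W],
    classify: Callable[[W], str],
    is_valid: Callable[[W], bool],
    classes_of_interest: Set[str],
) -> List[Tuple[str, W]]:
    """Keep only windows that are valid and whose class is of interest."""
    kept = []
    for w in windows:
        if not is_valid(w):
            continue  # false data: not usable for action classification
        label = classify(w)
        if label not in classes_of_interest:
            continue  # outlier data: a motion the system does not track
        kept.append((label, w))
    return kept
```

With standing, sitting and walking as the classes of interest, a window classified as running is dropped before it is relayed, which is the bandwidth saving the paragraph describes.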

[19] The use of some or all of the functionality described herein allows the system to

adapt to the deletion, addition or modification of sensors and sensor data. For

example, the Adaptive Global Recognition that uses Distributed Sparsity Classifier

(DSC) functions (portions of which may operate at any point in the system) allows the

system to continue to perform effectively when a sensor is turned off, removed,

malfunctioning, broken or otherwise is halted or impaired in its operation. In some cases,

as described herein, performance may actually improve after a sensor is removed or shut

off.

[20] Thus, one benefit of the system is that a user can modify the sensor arrangement

and the system can automatically adjust to perform with the new arrangement. For

example, if a user develops a skin irritation where a sensor is mounted and removes the

sensor, the system can adapt to the new configuration and still maintain motion

identification. In this case, once the managing entity determines that a sensor is missing it

can send a message to the user to inform the user of the missing sensor data. The user can

then reply to indicate that the user intended that the system be modified by removing a

sensor or the user may be alerted that a sensor has stopped without the user’s intent.

[21] This type of sensor placement or operation modification in the field is a benefit

to ongoing work with the system. “Hot plugging/unplugging” of sensors can be

performed where a sensor is added or removed without having to power down other parts

of the system or require user modification of system software. Since the managing entity

can detect such changes (while still maintaining operations in view of the changes) the



managing entity can then communicate with the user to make sure that the changes were

intended and the results after the changes are acceptable. Another feature allows the

managing entity to turn each sensor on or off remotely in order to test system adaptability

before a change is made.

[22] In one embodiment, one or more sensors are coupled to a body, where each

sensor is positioned at about a designated location on the body, and where each sensor is

configured to acquire motion data related to movement of the designated location on the

body at which the sensor is positioned, and to reduce the motion data into

compressed and transmittable motion data; and a base station is configured to receive the

compressed motion data via wireless communication from at least one of the

sensors, the base station being further configured to remove outlier information from the

received motion data, and to match the received motion data to a predetermined action,

where the predetermined action indicates a movement of the body.

[23] In one embodiment, a method can include: acquiring motion data related to

movement of a designated location on a body using a sensor; reducing the motion data

into compressed motion data; transmitting the compressed motion data to a base station

using a wireless connection (the base station may be integrated into the sensor as a

single module); removing outlier information from the transmitted motion data to create

outlier rejected motion data; and matching the outlier rejected motion data to a

predetermined action for indicating a movement of the body.

[24] A distributed recognition approach for classification of movement actions using

an attachable motion sensing and processing network is described herein. For example,

the motion sensor network can be attached to, or wearable by, a human being, and the

sensor network can be relatively low-bandwidth (e.g., in a range of about 250 kbit per

second at 2.4 GHz for the IEEE 802.15.4 protocol). A set of pre-segmented motion

sequences may be utilized as training examples, and an algorithm can substantially

simultaneously segment and classify such human actions. Further, particular



embodiments can also reject outlying actions that may not be found in the training set.

[25] The classification in particular embodiments can be operated in a distributed

fashion using individual sensors on a body, and a base station processing system that is

located on the body or remote from the body under detection. In the particular

embodiment, the distribution of multiple action classes satisfies a mixture subspace

model, with one subspace for each action class. Given a new test sample, a relatively

sparse linear representation of the sample with respect to the training examples can be

acquired. In this approach, dominant coefficients in the linear representation may

correspond to an action class of the test sample. Thus, membership of the dominant

coefficients may be encoded in the linear representation. Further, convex optimization

solvers are used to compute such representation via l1-minimization, and have been

known to be very efficient in processing high-dimensional data in linear or quadratic

algorithm complexity. For example, by using up to 8 body sensors, an algorithm in

particular embodiments can achieve state-of-the-art accuracy of about 98.8% on a set of

12 action categories (or with one body sensor, the algorithm can achieve accuracy of

approximately 60 to 90%). However, particular embodiments can support a relatively

large number of different actions (e.g., 10, 20, etc.). In addition, the recognition precision

may decrease gracefully using smaller subsets of sensors, validating distributed

framework robustness.

[26] Particular embodiments can also utilize wired or wireless communication

between each sensor node and the base station. Further, different subsets of sensors that

are available (e.g., due to dropped wireless signals) can be accommodated. Software can

be used on each sensor node for local computations, and on a central computer or base

station. Feature selection or compression of data obtained from each sensor node can be

performed to reduce information. Overall performance in particular embodiments is

gained from a combination of sensor accuracy and outlier rejection.

[27] Applications of particular embodiments include: (i) monitoring activities in



elderly people, the disabled and the chronically ill (e.g., remote over a network) for

nursing homes and hospitals; (ii) hospital emergency room monitoring for nursing

coverage; (iii) diagnosis of diseases (e.g., Parkinson’s, etc.); (iv) monitoring of prisoners

or soldiers (e.g., where sensors are embedded in uniforms), such as via base station

monitoring; (v) athletic training; (vi) monitoring patients in clinical drug studies; (vii)

monitoring of animal activities; and (viii) machine monitoring. Of course, particular

embodiments are also amenable to many more applications.

Human Action Recognition Introduction

[28] Human action recognition can be achieved using a distributed wearable motion

sensor network. One approach to action recognition is computer vision. As compared

with a model-based or appearance-based vision system, various aspects distinguish the

body sensor network approach of particular embodiments. In one aspect, the system does

not require adding cameras or other sensor instrumentation to the environment. In

another aspect, the system has the necessary mobility to support continuous monitoring

of a subject during the daily activities of the subject. In another aspect, and with the

continuing miniaturization of mobile processors and sensors, it has become possible to

manufacture wearable sensor networks that densely cover the human body to record and

analyze relatively small movements of the human body (e.g., breathing, spine

movements, heart beats, etc.). Such sensor networks can be used in applications, such as

medical-care monitoring, athlete training, tele-immersion, and human-computer

interaction (e.g., integration of accelerometers in Wii game controllers, smart phones,

etc.).

[29] Figure 1 illustrates an example wireless body sensor arrangement system 100.

For example, sensors 102 can be positioned at designated locations on a body, such as

sensor 102-1 at a waist location, sensor 102-2 at a left wrist location, sensor 102-3 at a

left upper arm location, sensor 102-4 at a right wrist location, sensor 102-5 at a right

ankle location, sensor 102-6 at a right knee location, sensor 102-7 at a left ankle



location, and sensor 102-8 at a left knee location.

[30] In some sensor networks, the computation performed by the sensor node is

fairly simple: (i) extract and filter sensor data and (ii) transmit the data to a

microprocessor-based server over a network for processing. In particular embodiments,

distributed pattern recognition is employed, whereby each sensor node can classify local

information. When the local classification detects a possible object or event, the sensor

node can become fully active and transmit the measurement to a centralized server. If

wireless interconnection is employed, it is desirable to reduce power consumption

because, e.g., the power consumption required to successfully send one byte over a

wireless channel is equivalent to executing between $1 \times 10^3$ and $1 \times 10^6$ instructions on an

onboard processor. Thus, such sensor networks should reduce communication, while

preserving recognition performance. On the server side, a global classifier can receive

data from the sensor nodes and further optimize the classification. The global classifier

can be more computationally involved than the distributed classifiers, but the global

classifier may also adapt to changes of available network sensors due to local

measurement error, sensor failure, and communication congestion.

[31] Feature extraction in wearable sensor networks can include three major types of

features. The first such feature can involve relatively simple statistics of a signal

sequence, such as the max, mean, variance, and energy. The second such feature may be

computed using fixed filter banks (e.g., fast Fourier transform (FFT), finite impulse

response filters, wavelets, etc.). The third such feature may be based on classical

dimensionality reduction techniques (e.g., principal component analysis (PCA), linear

discriminant analysis (LDA), etc.).
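To make the first feature type concrete, the sketch below computes the simple statistics named above (max, mean, variance and energy) for one axis of sensor data. The function name `simple_statistics` is hypothetical, and the exact definitions (population variance, energy as the sum of squared samples) are assumptions, since the text does not fix them.

```python
def simple_statistics(signal):
    """Return max, mean, variance and energy of one sensor-axis sequence.

    Assumptions: population variance (divide by N) and energy as the sum of
    squared samples; other conventions are equally possible.
    """
    n = len(signal)
    if n == 0:
        raise ValueError("signal must contain at least one sample")
    mean = sum(signal) / n
    variance = sum((v - mean) ** 2 for v in signal) / n
    energy = sum(v * v for v in signal)
    return {"max": max(signal), "mean": mean,
            "variance": variance, "energy": energy}


# Three illustrative samples from a single axis.
features = simple_statistics([1.0, 2.0, 3.0])
```

Features of this kind are cheap enough to compute on a sensor node, which is why they are commonly used as a first reduction step.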

[32] In terms of classification on the action features, some approaches have used,

e.g., thresholding or k-nearest-neighbor (kNN), due to the simplicity of the algorithms for

mobile devices. Other more sophisticated techniques have also been used, such as

decision trees and hidden Markov models.



[33] For distributed pattern recognition, distributed speech recognition and

distributed expert systems have been used, but for most distributed sensor systems, each

local observation from the distributed sensors is biased and insufficient to classify all

classes of actions. For example, sensors placed on the lower body may not perform

well in classification of those actions that mainly involve upper body motions (and vice

versa). Consequently, traditional majority-voting type classifiers may not achieve the best

performance globally.

[34] Figure 2 illustrates an example motion sensor system 200. Any suitable number

of sensors 102 (e.g., 102-1, 102-2, … 102-N) can be located at designated positions on a

body for motion sensing. The sensors 102 that are active may then transmit information

to base station 202. For example, each sensor 102 can include accelerometer 210 and

gyroscope 204. Controller 206 can receive motion data from accelerometer 210 and

gyroscope 204, and may provide information to transmitter 208 for transmission to base

station 202.

[35] Thus, design of a wearable sensor network in particular embodiments can

include: (i) sensors placed at various body locations, which communicate with (ii) a base

station that can communicate with a computer server. For example, the base station and

computer server can be connected through a universal serial bus (USB) port, or any other

suitable connection (e.g., a wireless connection). Further, the sensors and base station

may be built using commercially available products such as Tmote Sky™ boards from

companies such as Sentilla Corporation of Redwood City, California. Such products can

run software such as TinyOS on an 8 MHz microcontroller with 10K random-access

memory (RAM) and communicate using the 802.15.4 wireless protocol. Each sensor

node can include a triaxial accelerometer and a biaxial gyroscope, which may be attached

to the Tmote Sky™ board. In this example, each axis is reported as a 12-bit value to the

sensor, thus indicating values in the range of +/- 2g for the accelerometer, and +/- 500°/s

for the gyroscope.



[36] To avoid packet collision in the wireless channel, a time division multiple

access (TDMA) protocol can be used to allocate each sensor node a specific time slot

during which to transmit data. This allows transmission of sensor data at about 20 Hz

with minimal packet loss. To avoid drift in the network, the base station can periodically

broadcast a packet to resynchronize individual timers for each sensor node. The code to

interface with the sensors and transmit data may be implemented directly on a mote using

nesC, a variant of the C programming language. Any other suitable hardware and

software approach can be used.
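The slot allocation described above can be sketched as follows. This is only an illustration under stated assumptions: slots are taken to be equal in width with no guard interval, one frame is taken to last 1/20 s to match the approximately 20 Hz rate in the text, and the function name `tdma_schedule` is hypothetical. The actual firmware is described as being written in nesC; Python is used here purely for readability.

```python
def tdma_schedule(num_nodes, frame_rate_hz=20.0):
    """Return (node, slot_start_s, slot_end_s) tuples for one TDMA frame.

    Assumes equal-width slots with no guard interval; the base station's
    periodic resynchronization packet is not modelled here.
    """
    if num_nodes < 1:
        raise ValueError("need at least one sensor node")
    frame_s = 1.0 / frame_rate_hz
    slot_s = frame_s / num_nodes
    return [(node, node * slot_s, (node + 1) * slot_s)
            for node in range(num_nodes)]


# Eight nodes, matching the arrangement of Figure 1.
schedule = tdma_schedule(8)
```

Under these assumptions each of the eight nodes would own a 6.25 ms slot within every 50 ms frame.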

[37] In one example, a set of L wearable sensor nodes with triaxial accelerometers

and biaxial gyroscopes are attached to the human body. For example, denote

$a_l(t) = (x_l(t), y_l(t), z_l(t), \theta_l(t), \rho_l(t))^T \in \mathbb{R}^5$ as the measurement of the five sensors on

node l at time t, and $a(t) = (a_1^T(t), a_2^T(t), \ldots, a_L^T(t))^T \in \mathbb{R}^{5L}$ collects all sensor

measurements. Further, denote $s = (a(1), a(2), \ldots, a(l)) \in \mathbb{R}^{5L \times l}$ as an action sequence of

length l. Given K different classes of human actions, a set of $n_i$ training examples

$\{s_{i,1}, \ldots, s_{i,n_i}\}$ can be collected for each ith class. The durations of the sequences

naturally may be different. Given a new test sequence s that may contain multiple actions

and possible other outlying actions, a distributed algorithm can be used to substantially

simultaneously segment the sequence and classify the actions.

[38] Solving this problem mainly involves challenges of simultaneous segmentation

and classification, variation of action durations, identity independence, and distributed

recognition. Simultaneous segmentation and recognition from a long motion sequence

can be achieved, where the test sequence may contain other unknown actions that are not

from the K classes. An algorithm in particular embodiments can be robust to these

outliers.

[39] Figure 3 illustrates an example of action duration variations 300. Because the

durations of different actions can vary dramatically in practice, it can be difficult during

segmentation to determine the proper duration of an action. In addition to the variation of

action durations, different people move differently when performing the same actions,

which makes identity-independent recognition a further challenge.

[40] Figure 4 illustrates example waveforms 400 of accelerometer and gyroscope

readings for two repetitive stand-kneel-stand actions. For a test sequence, identity-

independent performance can be seen by excluding the training samples of the same

subject. Figure 4 shows readings of x-axis accelerometers (first and third diagrams) and

x-axis gyroscopes (second and fourth diagrams) from eight distributed sensors on two

repetitive stand-kneel-stand actions or sequences from two subjects.

[41] A distributed recognition system may also consider: (i) how to extract compact

and accurate low-dimensional action features for local classification and transmission

over a band-limited network; (ii) how to classify the local measurement in real time using

low-power processors; and (iii) how to design a classifier to globally optimize the

recognition and be adaptive to the change of the network.

[42] In particular embodiments, a distributed action recognition algorithm can

simultaneously segment and classify 12 human actions using up to 8 wearable motion

sensors. This approach utilizes an emerging theory of compressed sensing and sparse

representation, where each action class can satisfy a low-dimensional subspace model.

For example, a 10-D linear discriminant analysis (LDA) feature space may suffice to

locally represent 12 action subspaces on each node. If a linear representation is sought to

represent a valid test sample with respect to all training samples, the dominant

coefficients in the sparsest representation correspond to the training samples from the

same action class, and hence they encode the membership of the test sample.

[43] In one example system, three integrated components can be employed: (i) a

multi-resolution action feature extractor; (ii) fast distributed classifiers via l1-

minimization; and (iii) an adaptive global classifier. Particular embodiments can include



a method to accurately segment and classify human actions from a continuous motion

sequence. The local classifiers that reject potential outliers can reduce the sensor-to-server

communication requirements by approximately 50%. Further, any subset of the

sensors can be activated or deactivated on-the-fly, due to user control, sensor failure,

and/or network congestion. The global classifier may adaptively update the optimization

process and improve the overall classification upon available local decision.

[44] Particular embodiments can also support a public database and/or benchmark in

order to judge the performance and safeguard the reproducibility of extant algorithms for

action recognition using wearable sensors in pattern recognition. For example, a public

benchmark system may be referred to as a “Wearable Action Recognition Database”

(WARD). Such a database may contain many human or other suitable subjects across

multiple age groups, and be made available via the Internet.

Classification via Sparse Representation

[45] Classification via sparse representation can include an efficient action

classification method on each sensor node, where action sequences are pre-segmented.

Given an action segment of length l from node j, $s_j = (a_j(1), a_j(2), \ldots, a_j(l)) \in \mathbb{R}^{5 \times l}$, a

new vector can be defined:

Equation 1: $s_j^S \doteq (a_j(1)^T, a_j(2)^T, \ldots, a_j(l)^T)^T \in \mathbb{R}^{5l}$, as the stacking of the l

columns of $s_j$ (where $s_j$ can be interchangeably used to denote stacked vector $s_j^S$).

[46] Since the length l can vary among different subjects and actions, l can be

normalized to be substantially the same for all the training and test samples. For

example, this can be achieved by oversampling filtering such as by linear interpolation,

FFT interpolation, etc., or by other suitable techniques. After normalization, the



dimension of samples $s_j$ can be denoted as $D_j = 5l$. Subsequently, a full-body action

vector v can be defined that stacks the measurement from all L nodes:

Equation 2: $v = (s_1^T, s_2^T, \ldots, s_L^T)^T \in \mathbb{R}^D$, where $D = D_1 + \cdots + D_L = 5lL$.
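The normalization and stacking steps of Equations 1 and 2 can be sketched in Python with NumPy as follows. This is an illustration only: linear interpolation is one of the options the text mentions, and the function names `resample_segment`, `stack_columns` and `full_body_vector` are hypothetical.

```python
import numpy as np


def resample_segment(segment, length):
    """Linearly interpolate a (5 x l_original) segment to (5 x length)."""
    segment = np.asarray(segment, dtype=float)
    old_grid = np.linspace(0.0, 1.0, segment.shape[1])
    new_grid = np.linspace(0.0, 1.0, length)
    return np.vstack([np.interp(new_grid, old_grid, channel)
                      for channel in segment])


def stack_columns(segment):
    """Equation 1: stack the l columns of a (5 x l) segment into a 5l vector."""
    return np.asarray(segment, dtype=float).T.reshape(-1)


def full_body_vector(node_segments, length):
    """Equation 2: concatenate the stacked, normalized segments of all nodes."""
    return np.concatenate([stack_columns(resample_segment(seg, length))
                           for seg in node_segments])


# One node's segment with 5 channels and 3 time samples, resampled to l = 5.
seg = np.tile(np.array([0.0, 2.0, 4.0]), (5, 1))
resampled = resample_segment(seg, 5)
stacked = stack_columns(resampled)
v = full_body_vector([seg, seg], 5)   # two nodes, so D = 5 * 5 * 2 = 50
```

After normalization every node contributes a vector of the same dimension 5l, so training and test samples become directly comparable.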

[47] In particular embodiments, the samples v in an action class may satisfy a

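subspace model, called an action subspace. If the training samples $\{v_1, \ldots, v_{n_i}\}$ of the ith

class sufficiently span the ith action subspace, given a test sample $y = (y_1^T, \ldots, y_L^T)^T \in \mathbb{R}^D$

in the same class i, y can be linearly represented using the training examples of the same

class:

Equation 3: $y = \alpha_1 v_1 + \cdots + \alpha_{n_i} v_{n_i} \;\Leftrightarrow\; \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_L \end{pmatrix} = \left( \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_1 \cdots \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_{n_i} \right) \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n_i} \end{pmatrix}$.

Also, such linear constraints may also hold on each node j:

$y_j = \alpha_1 s_{j,1} + \cdots + \alpha_{n_i} s_{j,n_i} \in \mathbb{R}^{D_j}$.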

[48] Complex data, such as human actions, typically includes complex nonlinear

models. The linear models may be used to approximate such nonlinear structures in a

higher-dimensional subspace, as shown in Figure 5 (e.g., a one-dimensional manifold

model 500 using a two-dimensional subspace). Such linear approximation may not

produce good estimation of the distance/similarity metric for the samples on the

manifold. However, as shown in the example below, given sufficient samples on the

manifold as training examples, a new test sample can be accurately represented on the

subspace, provided that any two classes do not have similar subspace models.

[49] To recover label(y), one way is to reformulate the recognition using a global

sparse representation. Since label(y) = i is unknown, y can be represented using all the

training samples from all K classes:

Equation 4: $y = (A_1 \; A_2 \; \cdots \; A_K) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \end{pmatrix} = Ax$,

where $A_i = (v_{i,1}, v_{i,2}, \ldots, v_{i,n_i}) \in \mathbb{R}^{D \times n_i}$ collects the training samples of class i,

$x_i = (\alpha_{i,1}, \alpha_{i,2}, \ldots, \alpha_{i,n_i})^T \in \mathbb{R}^{n_i}$ collects the corresponding coefficients in Equation 3

above, and $A \in \mathbb{R}^{D \times n}$, where $n = n_1 + n_2 + \cdots + n_K$. Since y satisfies both Equations 3 and

4, one solution of x in Equation 4 can be $x^* = (0, \ldots, 0, x_i^T, 0, \ldots, 0)^T$. The solution is

naturally relatively sparse, where on average only 1/K terms in $x^*$ are nonzero values.

[50] On each sensor j, solution $x^*$ in Equation 4 is also a solution for the

representation:

Equation 5: $y_j = (A_1^{(j)} \; A_2^{(j)} \; \cdots \; A_K^{(j)}) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \end{pmatrix} = A^{(j)} x$,

where $A_i^{(j)} \in \mathbb{R}^{D_j \times n_i}$ includes row vectors in $A_i$ that correspond to the jth node. Hence,

$x^*$ can be solved either globally using Equation 4, or locally using Equation 5, provided

that the action data measured on each sensor node are sufficiently discriminant. Local

classification versus global classification will be discussed in more detail below.

[51] As to local classification in each sensor node, one major difficulty in solving

Equation 5 is the high dimensionality of the action data. In compressed sensing, one

reduces the dimension of a linear system by choosing a linear projection $R_j \in \mathbb{R}^{d \times D_j}$:

Equation 6: $\tilde{y}_j \doteq R_j y_j = R_j A^{(j)} x \doteq \tilde{A}^{(j)} x \in \mathbb{R}^d$.

For example, these matrices may be computed offline and simply stored on each sensor

node, and Rj may not be computed on the sensor node.

[52] After projection Rj, the feature dimension d may be much smaller than the

number n of all training samples. Therefore, the new linear system of Equation 6 may be

underdetermined. Numerically, stable solutions exist to uniquely recover sparse solutions

$x^*$ via l1-minimization:

Equation 7: $x^* = \arg\min \|x\|_1$ subject to $\tilde{y}_j = \tilde{A}^{(j)} x$.
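To illustrate how a sparse solution of an underdetermined system such as Equation 6 can be recovered, the sketch below uses orthogonal matching pursuit, a greedy method. Note that this is a swapped-in technique chosen to keep the example self-contained with NumPy alone; it is not the l1-minimization of Equation 7, and on hard instances the two can return different answers. The function name `omp_sparse_solve` and the small matrix are assumptions for illustration.

```python
import numpy as np


def omp_sparse_solve(A, y, max_nonzeros, tol=1e-8):
    """Greedy sparse solve of y = A x by orthogonal matching pursuit.

    Stand-in for the l1 solver: selects one column per iteration and refits
    the selected columns by least squares.
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    col_norms = np.linalg.norm(A, axis=0)
    col_norms[col_norms == 0.0] = 1.0          # avoid division by zero
    support = []
    coeffs = np.zeros(0)
    residual = y.copy()
    for _ in range(min(max_nonzeros, A.shape[1])):
        if np.linalg.norm(residual) <= tol:
            break
        scores = np.abs(A.T @ residual) / col_norms
        if support:
            scores[support] = -1.0             # never reselect a column
        support.append(int(np.argmax(scores)))
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x = np.zeros(A.shape[1])
    if support:
        x[support] = coeffs
    return x


# Underdetermined toy system: d = 3 features, n = 6 training samples.
A_tilde = np.array([[1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
                    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]])
y_tilde = 2.0 * A_tilde[:, 4]                  # generated by a single column
x_hat = omp_sparse_solve(A_tilde, y_tilde, max_nonzeros=2)
```

In this toy case the recovered vector has a single nonzero entry at the column that generated the sample, which is the kind of sparse structure the classification step relies on.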

[53] In one experiment, multiple projection operators were tested, including PCA,

LDA, and a random projection. This experiment resulted in the finding that 10-D feature

spaces using LDA lead to the best recognition in a very low-dimensional space. After the

(sparsest) representation x is recovered, the coefficients can be projected onto each action

subspace:

Equation 8: $\delta_i(x) = (0, \ldots, 0, x_i^T, 0, \ldots, 0)^T \in \mathbb{R}^n$, $i = 1, \ldots, K$.

Finally, the membership of the test sample $y_j$ may be assigned to the class with the

smallest residual:

Equation 9: $\mathrm{label}(y_j) = \arg\min_i \| \tilde{y}_j - \tilde{A}^{(j)} \delta_i(x) \|_2$.
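The projection of Equation 8 and the minimum-residual rule of Equation 9 can be sketched as follows, assuming the sparse coefficient vector x has already been computed by some solver. The function name `classify_by_residual` is hypothetical, class indices are 0-based in the code, and the coefficients of each class are assumed to occupy one contiguous block of x in class order.

```python
import numpy as np


def classify_by_residual(A, y, x, class_sizes):
    """Return (best_class_index, residuals) following Equations 8 and 9.

    For each class i, delta_i(x) keeps only the coefficients of class i and
    zeroes the rest; the class with the smallest residual norm wins.
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    residuals = []
    start = 0
    for n_i in class_sizes:
        delta = np.zeros_like(x)
        delta[start:start + n_i] = x[start:start + n_i]
        residuals.append(float(np.linalg.norm(y - A @ delta)))
        start += n_i
    return int(np.argmin(residuals)), residuals


# Toy data: 3 classes with 2 training samples each (n = 6), d = 3 features.
A_tilde = np.array([[1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
                    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]])
x_sparse = np.array([0.0, 0.0, 0.0, 0.0, 2.0, 0.0])
y_tilde = A_tilde @ x_sparse
label, residuals = classify_by_residual(A_tilde, y_tilde, x_sparse, [2, 2, 2])
```

Here all nonzero coefficients fall in the third class, so its residual is zero and the sample is assigned to it.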

[54] In one experiment, 12 action categories were designed: (1) stand-to-sit, (2) sit-

to-stand, (3) sit-to-lie, (4) lie-to-sit, (5) stand-to-kneel, (6) kneel-to-stand, (7) rotate-right,

(8) rotate-left, (9) bend, (10) jump, (11) upstairs, and (12) downstairs. More details on an

example experiment setup are shown below.

[55] To implement l1-minimization on a sensor node, suitable fast sparse solvers can



be used. In testing a variety of methods, such as (orthogonal) matching pursuit (MP),

basis pursuit (BP), LASSO, and a quadratic log-barrier solver, it was found that BP gives

a favorable trade-off between speed, noise tolerance, and recognition accuracy.

[56] To demonstrate the accuracy of the BP-based algorithm on each sensor node

(see, e.g., Figure 1 for example sensor node locations on a body), the actions can be

manually segmented from a set of long motion sequences from three subjects. In total,

there are 626 samples in the data set in this particular example. The 10-D feature

selection is via LDA, and the classification may be substantially identity-independent.

The accuracy of this example classification on each node over 12 action classes is shown

below in Table 1.

Table 1: Recognition accuracy on each node over 12 action classes

Sensor number    1      2      3      4      5      6      7      8

Accuracy (%)     99.9   99.4   99.9   100    95.3   99.5   93     100

[57] Figure 6 illustrates an example sparse l1 solution 600 for a stand-to-sit action on

a waist sensor node (top diagram), with corresponding residuals (bottom diagram). This

represents an example of the estimated sparse coefficients x and its residuals. As an

example of the speed involved, a simulation in MATLAB takes an average of 0.03 s to

process one test sample on a typical 3 GHz personal computer (PC). This example shows

that if the segmentation of the actions is known, and with no other invalid samples, the

sensors can recognize the 12 actions individually with relatively high accuracy. Thus, the

mixture subspace model is a good approximation of the action data. The sparse

representation framework can provide a unified solution for recognizing and segmenting

valid actions, while rejecting invalid actions. Further, this approach is adaptive to the

change of available sensors on the fly.

Distributed Segmentation and Recognition

[58] A multi-resolution action segmentation can be introduced on each sensor node,



and an estimate of a range of possible lengths for all actions of interest can be obtained

from the training examples. This estimated range can be evenly divided into multiple

length hypotheses: (h1, …, hs). At each time t in a motion sequence, the node tests a set

of s possible segmentations: y(1) = (a(t-h1), …, a(t)), … y(s) = (a(t-hs), …, a(t)), as

shown in Figure 7. Figure 7 illustrates example waveforms 700 for multiple

segmentation hypotheses on a wrist sensor node at a given time (or number of samples on

the time domain) t = 150 of a “downstairs” sequence. A good segment is h1, while others

are false segments, and the movement between about t = 250 and about t = 350 represents

an outlying action that the subject performed.
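The multi-resolution hypotheses described above can be sketched as follows. The function names `length_hypotheses` and `candidate_segments` are hypothetical, rounding the evenly spaced lengths to whole sample counts is an assumption, and hypotheses that would reach before the start of the sequence are simply skipped.

```python
def length_hypotheses(h_min, h_max, s):
    """Evenly divide the range [h_min, h_max] into s length hypotheses."""
    if s < 1:
        raise ValueError("need at least one hypothesis")
    if s == 1:
        return [round(h_max)]
    step = (h_max - h_min) / (s - 1)
    return [round(h_min + k * step) for k in range(s)]


def candidate_segments(sequence, t, hypotheses):
    """Return the candidates (a(t - h), ..., a(t)) for each usable length h."""
    return [sequence[t - h:t + 1] for h in hypotheses if t - h >= 0]


# Illustrative numbers only: lengths between 10 and 50 samples, s = 5.
hyps = length_hypotheses(10, 50, 5)
samples = list(range(100))            # stand-in for a(0), a(1), ...
candidates = candidate_segments(samples, 60, hyps)
```

Each candidate would then be normalized to a common length and tested for validity, so that at most the well-segmented hypotheses survive.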

[59] With each candidate, y may again be normalized to length l, and a sparse

representation x may be estimated using l1-minimization, as discussed above. Thus,

based on this sparsity assumption, if y is not a valid segmentation with respect to the

training examples due to either incorrect t or h, or the real action performed is not in the

training classes, the dominant coefficients of its sparsest representation x may not

correspond to any single class. As shown below in Equation 10, a sparsity concentration

index (SCI) can be used:

Equation 10: $\mathrm{SCI}(x) \doteq \dfrac{K \cdot \max_{j=1,\ldots,K} \|\delta_j(x)\|_1 / \|x\|_1 - 1}{K - 1} \in [0, 1]$.

[60] If the nonzero coefficients of x are evenly distributed among K classes, then

SCI(x) = 0, while if all the nonzero coefficients are associated with a single class, then

SCI(x) = 1. Therefore, a sparsity threshold τ1 may be introduced and applied to all sensor

nodes, where if SCI(x) > τ1, the segment is a valid local measurement, and its 10-D LDA

features $\tilde{y}$ can be sent to the base station. Figure 8 illustrates an example invalid

representation waveform 800, where SCI(x) = 0.13. In Figure 6 above, the action is

correctly classified as "Class 1," where SCI(x) = 0.7.
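The sparsity concentration index and the local threshold test can be sketched as follows. The function names are hypothetical, the coefficients of each class are assumed to occupy one contiguous block of x in class order, and returning 0 for an all-zero x is an assumption, since the index is undefined in that case.

```python
def sparsity_concentration_index(x, class_sizes):
    """SCI(x) per Equation 10 for coefficients grouped by class.

    `class_sizes` lists n_1, ..., n_K, the number of training samples
    (and hence coefficients) belonging to each class.
    """
    k = len(class_sizes)
    if k < 2:
        raise ValueError("SCI needs at least two classes")
    if sum(class_sizes) != len(x):
        raise ValueError("class_sizes must sum to len(x)")
    total = sum(abs(v) for v in x)
    if total == 0:
        return 0.0  # assumption: an all-zero x is treated as not concentrated
    block_norms = []
    start = 0
    for n_i in class_sizes:
        block_norms.append(sum(abs(v) for v in x[start:start + n_i]))
        start += n_i
    return (k * max(block_norms) / total - 1.0) / (k - 1.0)


def is_valid_local_measurement(x, class_sizes, tau1):
    """Local test: the segment is kept only if SCI(x) exceeds tau1."""
    return sparsity_concentration_index(x, class_sizes) > tau1


even = sparsity_concentration_index([1.0, 1.0, 1.0], [1, 1, 1])
single = sparsity_concentration_index([0.0, 0.0, 3.0, -1.0, 0.0, 0.0], [2, 2, 2])
partial = sparsity_concentration_index([3.0, 0.0, 1.0, 0.0], [2, 2])
```

Coefficients spread evenly over the classes give an index of 0, coefficients confined to one class give 1, and intermediate cases fall in between.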

[61] A global classifier that adaptively optimizes the overall segmentation and



classification can also be introduced. For example, suppose at time t, and with a length

hypothesis h, the base station receives L' action features from the active sensors ($L' \le L$).

For example, these features may be from the first L' sensors: $\tilde{y}' = (\tilde{y}_1^T, \ldots, \tilde{y}_{L'}^T)^T \in \mathbb{R}^{10L'}$.

Then the global sparse representation x satisfies the following linear system

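Equation 11: $\tilde{y}' = \begin{pmatrix} R_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \cdots & \vdots \\ 0 & \cdots & R_{L'} & \cdots & 0 \end{pmatrix} A x = R' A x = \tilde{A}' x$,

where $R' \in \mathbb{R}^{dL' \times D}$ may be a new projection matrix that only extracts action features from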

the first L' nodes. Consequently, an effect of changing active sensors for the global

classification may be formulated via global projection matrix R'. During this

transformation, data matrix A and sparse representation x may remain unchanged. The

linear system of Equation 6 then becomes a special case of Equation 11 when L'=1.
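The structure of the global projection matrix R' can be sketched as follows. The sketch generalizes "the first L' nodes" to an arbitrary list of active node indices, which is an assumption consistent with the statement elsewhere in the text that any subset of sensors can be active. The function name `global_projection` and the tiny matrices are hypothetical.

```python
import numpy as np


def global_projection(node_projections, active_nodes):
    """Build R' by placing each active node's R_j in its own column block.

    node_projections: list of (d x D_j) arrays, one per node, in node order.
    active_nodes: 0-based indices of the nodes whose features were received.
    """
    dims = [R.shape[1] for R in node_projections]
    offsets = np.concatenate([[0], np.cumsum(dims)])
    total_dim = int(offsets[-1])
    rows = []
    for j in active_nodes:
        R_j = np.asarray(node_projections[j], dtype=float)
        block_row = np.zeros((R_j.shape[0], total_dim))
        block_row[:, offsets[j]:offsets[j + 1]] = R_j
        rows.append(block_row)
    return np.vstack(rows)


# Three nodes with D_j = 2 and d = 1 each; nodes 0 and 2 are active.
projections = [np.array([[1.0, 0.0]]),
               np.array([[0.0, 1.0]]),
               np.array([[1.0, 1.0]])]
R_prime = global_projection(projections, [0, 2])
y_full = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_active = R_prime @ y_full
```

Because only R' changes when sensors drop in or out, the training matrix A and the unknown x keep the same shape, which is the property the paragraph above relies on.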

[62] Similar to the outlier rejection criterion on each sensor node in particular

embodiments, a global rejection threshold τ2 can be introduced. If SCI(x) > τ2 in

Equation 11, this is an indication that the most significant coefficients in x are

concentrated in a single training class. Hence $\tilde{y}$ may be assigned to that class, and a

corresponding length hypothesis h may provide segmentation of the action from the

motion sequence.

[63] Thus in particular embodiments, an overall algorithm on the sensor nodes and

on the network server may provide a substantially unified solution to segment and

classify action segments from a motion sequence using two simple parameters τ1 and τ2.

Typically, τ1 may be selected to be less restricted than τ2 in order to increase a recall rate,

because passing certain amounts of a false signal to a global classifier may be rejected by

τ2 when the action features from multiple nodes are jointly considered. The formulation

of adaptive classification Equation 11 via a global projection matrix R' and two sparsity



constraints τ1 and τ2 provides a relatively simple means of rejecting outliers from a

network of multiple sensor nodes. This approach compares favorably to other classical

methods, such as kNN and decision trees, because these methods need to train multiple

thresholds and decision rules when the number L' and the set of available sensors vary in

the full-body action vector $\tilde{y}' = (\tilde{y}_1^T, \ldots, \tilde{y}_{L'}^T)^T$.

[64] Further, a change of active sensor nodes can affect l1-minimization and the

classification of the actions. In compressed sensing, the efficacy of l1-minimization in

solving for the sparsest solution x in Equation 11 is characterized by an l0/l1 equivalence

relation. An example condition for the equivalence to hold is the k-neighborliness of $\tilde{A}'$.

As a special case, it can be shown that if x is the sparsest solution in Equation 11 for L' =

L, x may also be a solution for L' < L. Thus, the decrease of L' may lead to sparser

solutions of x.

[65] On the other hand, a decrease in available action features may also make $\tilde{y}'$ less

discriminant. For example, if a reduction is made to L' = 1, and only a wrist sensor node

is activated, then the l1-solution x may have nonzero coefficients associated to multiple

actions with similar wrist motions, albeit sparser. This is an inherent problem in methods

of classifying human actions using a limited number of motion sensors. In theory, if two

action subspaces in a low-dimensional feature space have a small subspace distance after

the projection, the corresponding sparse representation cannot distinguish the test

samples from the two classes. As will be shown below, reducing the available sensors

can reduce the discriminant power of the sparse representation in a lower-dimensional

space.

[66] Figure 9 illustrates a flow diagram of basic steps in an example method 900 of

determining a motion using distributed sensors. The flow diagram is entered at (902).

Motion data can be acquired in one or more sensors positioned on a body (904). The

motion data can then be reduced to form compressed motion data (906). This



compressed motion data can then be subjected to LSC classification (908). The steps of

904-906 can be repeated, such as by utilizing TDMA, to receive compressed data from

multiple working sensor nodes, or to iteratively receive data from one or more sensor

nodes. At step 910 a check is made whether the LSC classification has resulted in a valid

motion. If so, step 912 is performed to call the DSC procedure to classify the motion

data. If not, sensor data acquisition is resumed.

[67] Outlier information can be removed from the received motion data. Motion

data with outlier information removed can then be compared to predetermined actions to

indicate movement of the body. At step 914 a check is made as to whether the motion

data can be verified as a valid motion. If so, output classification (918) occurs. Otherwise

the data is rejected (916), thus completing the flow (916).
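The two validity checks in the flow of Figure 9 (steps 910 and 914) can be sketched as a pair of threshold tests. The function names and the example index values are hypothetical; the thresholds 0.2 and 0.4 are the values reported for the experiments in this specification, and the classification steps themselves are assumed to have been carried out elsewhere.

```python
def select_valid_nodes(local_sci_by_node, tau1):
    """Step 910: nodes whose local SCI exceeds tau1 may pass their features on."""
    return sorted(node for node, sci in local_sci_by_node.items() if sci > tau1)


def global_decision(global_sci, candidate_label, tau2):
    """Step 914: output the label only if the global SCI exceeds tau2."""
    return candidate_label if global_sci > tau2 else None


TAU1, TAU2 = 0.2, 0.4

# Illustrative local indices for three nodes; node 2 is rejected locally.
active = select_valid_nodes({1: 0.70, 2: 0.13, 3: 0.35}, TAU1)
accepted = global_decision(0.55, "stand-to-sit", TAU2)
rejected = global_decision(0.30, "stand-to-sit", TAU2)
```

A looser local threshold lets borderline segments through to the global check, where features from several nodes are considered jointly before a label is output or the data is rejected.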

[68] Note that the processing of the various steps illustrated in Figure 9 can be

performed at any suitable point in the system of Figure 19. In one embodiment,

(discussed in association with Figure 19), a managing entity can perform final

classification and other functions.

[69] In other embodiments different components in the system can be used to

perform various portions of the processing. For example, in a particular embodiment

described in Reference 4, no managing entity need be present. The managing software

can be located within element 1930 of Figure 19 (the mobile station for each subject). Central body

controller 1930 can be used to perform management functions including classification

functions such as LSC classification 908 and DSC classification 912. An action database

or portions thereof, and other code and data can be stored on different components, e.g.,

at nodes such as body sensors 1920-1 to 1920-5 and/or controller 1930 of Figure 19.

[70] Performance of the system may be validated using a data set collected from,

e.g., three male subjects at ages of 28, 30, and 32. In this particular experiment, 8




wearable sensor nodes were placed at different body locations, such as shown in Figure 1.

A set of 12 action classes was designed: (1) Stand-to-Sit (StSi); (2) Sit-to-Stand (SiSt);

(3) Sit-to-Lie (SiLi); (4) Lie-to-Sit (LiSi); (5) Stand-to-Kneel (StKn); (6) Kneel-to-Stand

(KnSt); (7) Rotate-Right (RoR); (8) Rotate-Left (RoL); (9) Bend; (10) Jump; (11)

Upstairs (Up); and (12) Downstairs (Down). This system was tested under various action

durations. Toward this end, the subjects were asked to perform StSi, SiSt, SiLi, and LiSi

with two different speeds (slow and fast), and to perform RoR and RoL with two

different rotation angles (90° and 180°). The subjects were asked to perform a sequence

of related actions in each recording session based on their own interpretations of the

actions. There are 626 actions performed in the data set (see, e.g., Table 3 below for the

numbers in individual classes).

[71] Table 2 below shows precision versus recall of the algorithm with different

active sensor nodes. For these particular experiments, τ1 = 0.2 and τ2 = 0.4. When all

sensor nodes are activated, the algorithm can achieve about 98.8% accuracy among the

extracted actions, and 94.2% detection of the true actions. The performance may

decrease when more sensor nodes become unavailable to the global classifier.

Experimental results show that if one sensor node is maintained on the upper body (e.g.,

sensor 102-2 at position 2 in Figure 1) and one motion sensor node is maintained on the

lower body (e.g., sensor 102-7 at position 7 in Figure 1), the algorithm can still achieve

about 94.4% precision and 82.5% recall. Further, on average the 8 distributed classifiers

that reject invalid local measurements reduce the node-to-station communication by

about 50%.

Table 2: Precision versus recall with different sets of activated sensors

Sensors          2      7      2,7    1,2,7   1-3,7,8   1-8
Precision [%]    89.8   94.6   94.4   92.8    94.6      98.8
Recall [%]       65     61.5   82.5   80.6    89.5      94.2

[72] The relatively low recall on single sensor nodes (e.g., 102-2 and 102-7) is

due to the relatively large number of potential outlying segments present in a

long motion sequence (see, e.g., Figure 7). Also, the difference may be compared using

two “confusion” tables (see, e.g., Tables 3 and 4 below). As shown in these examples, a

single node (e.g., 102-2) that is positioned on a left wrist performed poorly mainly on two

action categories: Stand-to-Kneel and Upstairs-Downstairs, both of which involve

significant movements of the lower body, but not the upper body. This is one reason for

the low recall shown in Table 2 above. On the other hand, for the actions that are

detected using sensor node 102-2, the system can still achieve about 90% accuracy, thus

demonstrating the robustness of the distributed recognition framework.

Table 3: Confusion table using sensors 102-1 through 102-8

Class (total)    1   2   3   4   5   6   7   8   9  10  11  12
1: StSi (60)    60   0   0   0   0   0   0   0   0   0   0   0
2: SiSt (60)     0  52   0   0   0   0   0   0   0   0   0   0
3: SiLi (62)     1   0  58   0   0   0   0   0   0   0   0   0
4: LiSi (62)     0   0   0  60   0   0   0   0   0   0   0   0
5: Bend (30)     1   0   0   0  29   0   0   0   0   0   0   0
6: StKn (33)     0   0   0   0   0  31   0   0   0   0   0   0
7: KnSt (30)     0   0   0   0   0   0  30   0   0   0   1   0
8: RoR (95)      0   0   0   0   0   0   0  93   0   0   0   1
9: RoL (96)      0   0   0   0   0   0   0   0  96   0   0   0
10: Jump (34)    0   0   0   0   0   0   0   0   0  31   0   0
11: Up (33)      0   0   0   0   0   0   0   0   0   0  24   0
12: Down (31)    0   0   0   0   0   0   0   0   0   0   3  26
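The overall precision and recall reported for the full sensor set can be recomputed from Table 3. The sketch below assumes that precision is the fraction of detected actions that were labeled correctly and that recall is the fraction of all performed actions that were labeled correctly; the variable and function names are illustrative only.

```python
# Per-class totals and entries transcribed from Table 3 (classes in table order).
TOTALS = [60, 60, 62, 62, 30, 33, 30, 95, 96, 34, 33, 31]
DIAGONAL = [60, 52, 58, 60, 29, 31, 30, 93, 96, 31, 24, 26]
# Non-zero off-diagonal cells as {(row, column): count}, zero-based indices.
OFF_DIAGONAL = {(2, 0): 1, (4, 0): 1, (6, 10): 1, (7, 11): 1, (11, 10): 3}


def precision_recall(diagonal, off_diagonal, totals):
    """Return (precision, recall) under the definitions assumed above."""
    correct = sum(diagonal)                          # correctly labeled actions
    detected = correct + sum(off_diagonal.values())  # every action given a label
    performed = sum(totals)                          # every action in the data set
    return correct / detected, correct / performed


precision, recall = precision_recall(DIAGONAL, OFF_DIAGONAL, TOTALS)
print(f"precision = {100 * precision:.1f}%, recall = {100 * recall:.1f}%")
# prints "precision = 98.8%, recall = 94.2%"
```

Under these definitions, the 36 actions that appear in no column of their row are missed detections, which lower recall but not precision.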

[73] Examples of classification results are shown to demonstrate algorithm accuracy

using all 8 sensor nodes (e.g., 102-1 through 102-8). Each of Figures 10-18 plots the

readings from x-axis accelerometers on the 8 sensor nodes. The segmentation results are

then superimposed. Indications of correctly classified action segment locations, as well

as false classification locations are shown, and some valid actions may not be detected by

the algorithm.

[74] Figure 10 illustrates example waveforms 1000 for an x-axis accelerometer

reading for a stand-sit-stand action. Figure 11 illustrates example waveforms 1100 for an




x-axis accelerometer reading for a sit-lie-sit action. Figure 12 illustrates example

waveforms 1200 for an x-axis accelerometer reading for a bend down action. Figure 13

illustrates example waveforms 1300 for an x-axis accelerometer reading for a kneel-

stand-kneel action. Figure 14 illustrates example waveforms 1400 for an x-axis

accelerometer reading for a turn clockwise then counterclockwise action. Figure 15 illustrates

example waveforms 1500 for an x-axis accelerometer reading for a turn clockwise 360°

action. Figure 16 illustrates example waveforms 1600 for an x-axis accelerometer

reading for a turn counterclockwise 360° action. Figure 17 illustrates example

waveforms 1700 for an x-axis accelerometer reading for a jump action. Figure 18

illustrates example waveforms 1800 for an x-axis accelerometer reading for a go upstairs

action.

Table 4: Confusion table using sensor 102-2

Class (total)    1   2   3   4   5   6   7   8   9  10  11  12
1: StSi (60)    37   0   2   0   0   0   0   4   0   0   0   0
2: SiSt (60)     0  50   0   0   0   0   0   0   2   0   0   0
3: SiLi (62)     1   0  38   0   0   0   0   0   0   0   0   0
4: LiSi (62)     0   7   0  32   0   0   0   0   0   0   0   0
5: Bend (30)     0   1   0   0  26   0   0   0   0   0   0   0
6: StKn (33)     0   1   0   1   0   7   0   2   3   0   0   0
7: KnSt (30)     0   1   0   0   1   0   6   3   3   0   0   0
8: RoR (95)      0   0   0   0   0   0   0  92   0   0   0   0
9: RoL (96)      0   0   0   0   0   0   0   0  95   0   0   0
10: Jump (34)    0   0   0   0   0   0   0   0   1  24   0   0
11: Up (33)      0   0   0   0   0   0   0   1   8   0   0   0
12: Down (31)    0   0   0   0   0   0   1   0   3   0   0   0
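The per-class behaviour of the single wrist sensor can be read off Table 4 by dividing each diagonal entry by the class total. In the sketch below, the 50% cut-off used to flag weak classes is an arbitrary illustrative choice, not a value from the application, and the names are hypothetical.

```python
# Class names, totals, and diagonal entries transcribed from Table 4.
NAMES = ["StSi", "SiSt", "SiLi", "LiSi", "Bend", "StKn",
         "KnSt", "RoR", "RoL", "Jump", "Up", "Down"]
TOTALS = [60, 60, 62, 62, 30, 33, 30, 95, 96, 34, 33, 31]
CORRECT = [37, 50, 38, 32, 26, 7, 6, 92, 95, 24, 0, 0]

# Per-class recall: correctly labeled actions divided by actions performed.
per_class_recall = {
    name: hits / total for name, hits, total in zip(NAMES, CORRECT, TOTALS)
}

# Illustrative cut-off (an assumption) for listing poorly recognized classes.
WEAK_CUTOFF = 0.5
weak_classes = [name for name in NAMES if per_class_recall[name] < WEAK_CUTOFF]

for name in NAMES:
    print(f"{name:5s} {100 * per_class_recall[name]:5.1f}%")
print("weak classes:", weak_classes)
```

With this cut-off, the flagged classes are the kneeling and stair actions, which is consistent with the lower-body explanation given for the low single-sensor recall.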

Cell Phone Example

[75] In one experiment, a cellular phone that incorporated a three-axis accelerometer

(Apple iPhone®) was utilized as a sensor node for human action classification. A

software application was loaded on the iPhone that enabled streaming of three-axis




accelerometer data via a Wi-Fi (IEEE 802.11 protocol) data link to a PC. Five (5)

subjects were studied. Wearing the iPhone attached to a lanyard worn around the neck,

subjects performed a series of six action categories: (1) stand-to-sit, (2) sit-to-stand, (3)

walk, (4) upstairs, (5) downstairs and (6) stand still. A subset of the accelerometer data

collected was hand segmented to create training examples for each human action class.

Continuous, non-segmented accelerometer data was processed using LSC, and the

predicted action class was compared to the known actions recorded during the test.

Human action classification accuracy of 86% was achieved using LDA projection of

two-second data segments.
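As an illustration of how a continuous three-axis stream might be cut into two-second segments before projection, consider the sketch below. The sampling rate, the one-second step between segments, and the function name are assumptions made for this example; the application does not state them.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # one (x, y, z) accelerometer reading


def two_second_segments(
    samples: Sequence[Sample],
    rate_hz: float,
    window_s: float = 2.0,
    step_s: float = 1.0,  # overlap between segments is an assumption
) -> List[List[Sample]]:
    """Split a continuous stream into fixed-length, possibly overlapping windows."""
    window = int(round(window_s * rate_hz))
    step = int(round(step_s * rate_hz))
    if window <= 0 or step <= 0:
        raise ValueError("window and step must cover at least one sample")
    return [
        list(samples[start:start + window])
        for start in range(0, len(samples) - window + 1, step)
    ]


# Hypothetical stream: 5 seconds of data at an assumed 20 Hz sampling rate.
stream = [(float(i), 0.0, 0.0) for i in range(100)]
segments = two_second_segments(stream, rate_hz=20.0)
print(len(segments), len(segments[0]))
```

Each resulting segment would then be flattened into a feature vector and projected (for example by LDA) before classification.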

Conclusion

[76] Building on emerging compressed sensing theory, particular embodiments

include a distributed algorithm approach for segmenting and classifying human actions

using a wearable motion sensor network. For example, a framework provides a unified

solution based on ℓ1-minimization to classify valid action segments and reject outlying

actions on the sensors and the base station. The example experiments show that a set of

12 action classes can be accurately represented and classified using a set of 10-D LDA

features measured at multiple body locations. Further, the proposed global classifier can

adaptively adjust the global optimization to boost the recognition upon available local

measurements.
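One common way to reject outliers from a sparse representation is a sparsity concentration index (SCI), which measures how strongly the coefficients concentrate on a single class. The sketch below assumes that form of test. The function names are hypothetical, the threshold of 0.2 borrows the value reported for τ1 in the experiments (its use as an SCI threshold is an assumption), and labeling by largest coefficient mass is a simplification where a residual-based rule could be used instead.

```python
from typing import Optional, Sequence


def sparsity_concentration_index(
    coeffs: Sequence[float], class_of: Sequence[int], num_classes: int
) -> float:
    """SCI in [0, 1]: 1 if all coefficient mass lies in one class, 0 if evenly spread."""
    if num_classes < 2:
        raise ValueError("SCI needs at least two classes")
    total = sum(abs(c) for c in coeffs)
    if total == 0.0:
        return 0.0
    mass = [0.0] * num_classes
    for c, k in zip(coeffs, class_of):
        mass[k] += abs(c)                  # l1 mass of coefficients per class
    return (num_classes * max(mass) / total - 1.0) / (num_classes - 1.0)


def classify_or_reject(
    coeffs: Sequence[float],
    class_of: Sequence[int],
    num_classes: int,
    threshold: float = 0.2,               # assumed role for the reported tau_1
) -> Optional[int]:
    """Return the dominant class index, or None when the sample looks like an outlier."""
    if sparsity_concentration_index(coeffs, class_of, num_classes) < threshold:
        return None
    mass = [0.0] * num_classes
    for c, k in zip(coeffs, class_of):
        mass[k] += abs(c)
    return max(range(num_classes), key=lambda k: mass[k])
```

Because the same test can be applied to a local or a global coefficient vector, a single routine of this kind could serve both levels with different thresholds.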

[77] Further details are shown in the papers included with this application. For

example, a design description of an example implementation of an LSC procedure is

illustrated at page 6 of Reference 1. A design description of a DSC procedure is shown at

page 8 of Reference 1.

[78] Although the description has been given with respect to particular

embodiments thereof, these particular embodiments are merely illustrative, and not

restrictive. For example, particular embodiments may also be applied to classify a broad




body of biological signals, such as electrocardiogram signals, respiratory patterns, and

waveforms of brain and/or other organ activities, via respective biological sensors. Such

actions or motions may be by humans or other biological animals and functionality

described herein may even be applied for motion by mechanical entities such as robots or

other machines.

[79] Other applications may benefit from aspects or embodiments of the invention.

For example, multiple sensors could be placed at designated positions on manufacturing

equipment and by analyzing the data captured by each sensor, the “health” of the

production line could be determined. Another example is a power management

application where local conditions of power generation, consumption, loss and other

factors are monitored and used to adjust the performance of the power system.

Performing local classification and/or data filtering and aggregation may reduce the

amount of data traffic, reduce complexity of analysis, improve speed or efficiency of the

system, or provide other benefits. In general, any system that uses distributed data-

sensing and data relay may benefit from features described herein.

[80] Higher classification accuracy may be realized using features described herein.

For example, as shown in the accompanying References, a higher human action

classification accuracy may be realized such as in the range 95% to 99% versus 85% to

90% for other approaches. False alarms may be reduced, which has been a significant

problem plaguing other systems.

[81] Power consumption may be reduced by requiring less data transmission

between nodes and base station since only outlier classifications require transmission and

global consideration. Thus, continuous streaming of data need not be required.

[82] The system accuracy can degrade gracefully with the loss of sensor nodes. The

system design is more easily scalable as additional nodes can be added to improve

performance or add functionality.




[83] Embodiments described herein can support the expansion or introduction of

new training data to help the system to adapt or learn as it is used. This can be important

since most previous systems have been "one-size-fits-all" and as a consequence accuracy

and false alarms have been an issue.

[84] In one embodiment, the approach requires only two parameters to optimize

system performance: the outlier thresholds at the local classifier and at the global level. Other

approaches can require significant tuning to obtain good performance. Embodiments can

use the same or similar classifier algorithms at the mote (i.e., node, sensor or body) level

and at the global level, simplifying the system development.

[85] Any suitable programming language can be used to implement the routines of

particular embodiments including C, C++, Java, assembly language, etc. Different

programming techniques can be employed such as procedural or object oriented. The

routines can execute on a single processing device or multiple processors. Although the

steps, operations, or computations may be presented in a specific order, this order may be

changed in different particular embodiments. In some particular embodiments, multiple

steps shown as sequential in this specification can be performed at the same time.

[86] Particular embodiments may be implemented in a computer-readable storage

medium for use by or in connection with the instruction execution system, apparatus,

system, or device. Particular embodiments can be implemented in the form of control

logic in software or hardware or a combination of both. The control logic, when

executed by one or more processors, may be operable to perform that which is described

in particular embodiments.

[87] Particular embodiments may be implemented by using a programmed general

purpose digital computer, or by using application specific integrated circuits, programmable

logic devices, or field programmable gate arrays. Optical, chemical, biological, quantum or

nanoengineered systems, components and mechanisms may also be used. In general, the




functions of particular embodiments can be achieved by any means as is known in the art.

Distributed, networked systems, components, and/or circuits can be used.

Communication, or transfer, of data may be wired, wireless, or by any other means.

[88] It will also be appreciated that one or more of the elements depicted in the

drawings/figures can also be implemented in a more separated or integrated manner, or

even removed or rendered as inoperable in certain cases, as is useful in accordance with a

particular application. It is also within the spirit and scope to implement a program or

code that can be stored in a machine-readable medium to permit a computer to perform

any of the methods described above.

[89] As used in the description herein and throughout the claims that follow, “a”,

“an”, and “the” include plural references unless the context clearly dictates otherwise.

Also, as used in the description herein and throughout the claims that follow, the meaning

of “in” includes “in” and “on” unless the context clearly dictates otherwise.

[90] Thus, while particular embodiments have been described herein, latitudes of

modification, various changes, and substitutions are intended in the foregoing

disclosures, and it will be appreciated that in some instances some features of particular

embodiments will be employed without a corresponding use of other features without

departing from the scope and spirit as set forth. Therefore, many modifications may be

made to adapt a particular situation or material to the essential scope and spirit.




We Claim:

1. A method for obtaining data in a distributed sensor system, wherein

aggregated sensor data is used to achieve a result, the method comprising:

obtaining sensor data from first and second sensors at a local site;

using local processing to perform at least a portion of first classification of

the sensor data.

2. The method of claim 1, wherein the first classification includes a

Distributed Sparsity Classifier (DSC) function.

3. The method of claim 2, wherein the first classification includes a Local

Sparsity Classifier (LSC) function.

4. The method of claim 3, wherein the distributed sensor system includes

nodes, wherein each node comprises a particular sensor and resources associated with the

particular sensor, the method further comprising:

performing the LSC function at one or more nodes.

5. The method of claim 1, wherein the distributed sensor system is used in

human body action detection.

6. The method of claim 1, further comprising:

using additional processing to perform at least a portion of second

classification of the sensor data.

7. The method of claim 6, wherein the additional processing is performed

at least in part by a managing entity at a location remote from where the local processing




is performed.

8. The method of claim 1, further comprising:

determining that a performance of a sensor has changed; and

adapting a classification of data in response to the determining.

9. An apparatus for obtaining data in a distributed sensor system, wherein

aggregated sensor data is used to achieve a result, the apparatus comprising:

a processor;

a processor-readable medium including one or more instructions for:

obtaining sensor data from first and second sensors at a local site; and


using local processing to perform at least a portion of first classification of

the sensor data.

10. A processor-readable medium including instructions executable by a

processor for obtaining data in a distributed sensor system, wherein aggregated sensor

data is used to achieve a result, the processor-readable medium comprising one or more

instructions for:

obtaining sensor data from first and second sensors at a local site; and


using local processing to perform at least a portion of first classification of

the sensor data.





Abstract

An approach for determining motions of a body using distributed sensors is

disclosed. In one embodiment, an apparatus can include: a plurality of sensors coupled to

a body, where each sensor is positioned at about a designated location on the body, and

where each sensor is configured to acquire motion data related to movement of the

designated location on the body and at which the sensor is positioned, and to reduce the

motion data into compressed and transmittable motion data; and a base station configured

to receive the compressed motion data via wireless communication from at least one of

the plurality of sensors, the base station being further configured to remove outlier

information from the received motion data, and to match the received motion data to a

predetermined action, where the predetermined action indicates a movement of the body.

Figure 1 (100): sensor nodes 102-1 through 102-8 positioned on a body.

Figure 2 (200): sensor node 102 with accelerometer 210, gyroscope 204, controller 206, and transmitter 208; sensor nodes 102-1, 102-2, ..., 102-N located on a body for motion sensing; base station 202.

Figures 3 through 8 (reference numerals 300, 400, 500, 600, 700, 800).

Figure 9 (900): flow diagram. Start (902); Acquire motion data in a sensor node positioned on a body (904); Reduce the motion data to form compressed motion data (906); Call LSC to Classify Motion Data (908); Is Motion Valid? (910, Y/N); Call DSC to Classify Motion Data (912); Is Data Valid? (914, Y/N); Output Classification (918); Reject Data (916); End (916).

Figures 10 through 18 (reference numerals 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800).

Figure 19: Internet; Managing Entity; reference numerals 1910, 1920-2, 1920-3, 1920-4, 1920-5, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 1992, 1994.

PTO/SB/08a (04-09)




Doc code: IDS

Doc description: Information Disclosure Statement (IDS) Filed

INFORMATION DISCLOSURE

STATEMENT BY APPLICANT ( Not for submission under 37 CFR 1.99)


NON-PATENT LITERATURE DOCUMENTS


First Named Inventor: BAJCSY et al.

Attorney Docket Number: 010030-002710US




1. YANG et al., "DISTRIBUTED SEGMENTATION AND CLASSIFICATION OF HUMAN ACTION USING A WEARABLE MOTION SENSOR NETWORK", pages 1-8

2. YANG et al., "DISTRIBUTED RECOGNITION OF HUMAN ACTIONS USING WEARABLE MOTION SENSOR NETWORKS", Journal of Ambient Intelligence and Smart Environments 1 (2009) 1-5, IOS Press, pages 1-13

3. YANG et al., "DISTRIBUTED SEGMENTATION AND CLASSIFICATION OF HUMAN ACTIONS USING A WEARABLE MOTION SENSOR NETWORK", Electrical Engineering & Computer Sciences, University of California at Berkeley, Dec. 6, 2007, pages 1-19

4. KURYLOSKI et al., "DEXTERNET: AN OPEN PLATFORM FOR HETEROGENEOUS BODY SENSOR NETWORKS AND ITS APPLICATIONS", IPSN'09, San Francisco, CA, USA, pages 11-11




CERTIFICATION STATEMENT

Please see 37 CFR 1.97 and 1.98 to make the appropriate selection(s):

That each item of information contained in the information disclosure statement was first cited in any communication

from a foreign patent office in a counterpart foreign application not more than three months prior to the filing of the

information disclosure statement. See 37 CFR 1.97(e)(1).

OR

That no item of information contained in the information disclosure statement was cited in a communication from a

foreign patent office in a counterpart foreign application, and, to the knowledge of the person signing the certification

after making reasonable inquiry, no item of information contained in the information disclosure statement was known to

any individual designated in 37 CFR 1.56(c) more than three months prior to the filing of the information disclosure

statement. See 37 CFR 1.97(e)(2).

See attached certification statement.

Fee set forth in 37 CFR 1.17 (p) has been submitted herewith.

None

SIGNATURE

A signature of the applicant or representative is required in accordance with CFR 1.33, 10.18. Please see CFR 1.4(d) for the

form of the signature.

Signature Date (YYYY-MM-DD)

Name/Print Registration Number


/Charles J. Kulas/ 2009-12-03

Charles J. Kulas 35809


Distributed Segmentation and Classification of Human

Actions Using a Wearable Motion Sensor Network

Allen Yang, Roozbeh Jafari, Philip Kuryloski, Sameer Iyengar, S. Shankar Sastry, Ruzena Bajcsy

Electrical Engineering and Computer Sciences

University of California at Berkeley

Technical Report No. UCB/EECS-2007-143

http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-143.html

December 6, 2007

Copyright © 2007, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

Acknowledgement

Yang and Sastry are partially supported by ARO MURI W911NF-06-1-0076. Jafari is partially supported by the startup fund from the University of Texas and Texas Instruments. Bajcsy is partially supported by NSF IIS 0724682. Kuryloski, Iyengar, Sastry, and Bajcsy are partially supported by TRUST (Team for Research in Ubiquitous Secure Technology), which receives support from NSF CCF-0424422, AFOSR FA9550-06-1-0244, and the following organizations: Cisco, British Telecom, ESCHER, HP, IBM, iCAST, Intel, Microsoft, ORNL, Pirelli, Qualcomm, Sun, Symantec, Telecom Italia, and United Technologies.


Distributed Segmentation and Classification of

Human Actions

Using a Wearable Motion Sensor Network

Allen Y. Yang, Roozbeh Jafari, Philip J. Kuryloski, Sameer Iyengar, S. Shankar Sastry, and Ruzena Bajcsy

Abstract

We propose a distributed recognition framework to classify human actions using a wearable motion sensor network. Each sensor node consists of an integrated triaxial accelerometer and biaxial gyroscope. Given a set of pre-segmented actions as training examples, the algorithm simultaneously segments and classifies human actions from a motion sequence, and it also rejects unknown actions that are not in the training set. The classification is performed in a distributed manner on individual sensor nodes and a base station computer. Due to rapid advances in the integration of mobile processors and heterogeneous sensors, a distributed recognition system likely outperforms traditional centralized recognition methods. In this paper, we assume the distribution of multiple action classes satisfies a mixture subspace model, one subspace for each action class. Given a new test sample, we seek the sparsest linear representation of the sample w.r.t. all training examples. We show that the dominant coefficients in the representation only correspond to the action class of the test sample, and hence its membership is encoded in the representation. We provide fast linear solvers to compute such representation via ℓ1-minimization.
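For orientation, the sparse representation described in the abstract can be written in the standard ℓ1-minimization form below. The symbols are chosen here for illustration and may differ from the paper's own notation: y denotes the feature vector of the test sample, A stacks the training examples of all K action classes, and x is the coefficient vector.

```latex
% Assumed notation: A = [A_1, ..., A_K] collects training examples by class,
% y is the test sample, and x is the coefficient vector to be recovered.
\[
  x^{*} \;=\; \arg\min_{x} \, \lVert x \rVert_{1}
  \quad \text{subject to} \quad y = A x ,
  \qquad A = [A_{1}, A_{2}, \ldots, A_{K}] .
\]
```

If y belongs to class i, the dominant entries of the solution are expected to fall in the block of coefficients associated with A_i, which is how class membership is read from the representation.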

I. INTRODUCTION

In this paper, we consider human action recognition on a distributed wearable motion sensor network. Each sensor node is

integrated with a triaxial accelerometer and biaxial gyroscope. The locations of the sensors are roughly defined to be the waist,

two wrists, left arm, two knees, and two ankles, as shown in Fig 1. Action recognition has been studied to a great extent in

computer vision in the past. Compared to a model-based or appearance-based vision system, the body sensor network approach

has the following advantages: 1. The system does not require instrumenting the environment with cameras or other sensors.

2. The system has the necessary mobility to support continuous monitoring of a subject during her daily activities. 3. With

the continuing integration of mobile processors, sensors, and batteries, it has become possible to manufacture wearable sensor

networks that densely cover the human body to record and analyze very small movements of the human body (e.g., breathing

and spine movements). Such sensor networks can be used in applications such as medical-care oriented surveillance, athletic

training, tele-immersion, and human-computer interaction.

Fig. 1. A distributed wearable sensor network. The sensor on the right arm malfunctioned during the experiment.

Yang, Iyengar, Sastry, and Bajcsy are with the Department of Electrical Engineering and Computer Science, University of California, Berkeley. Jafari is with the Department of Electrical Engineering, University of Texas at Dallas. Kuryloski is with the Department of Electrical Engineering and Computer Science, University of California, Berkeley, and the Department of Electrical and Computer Engineering, Cornell University. Corresponding author: Allen Y. Yang, Rm 307 Cory Hall, UC Berkeley, Berkeley, CA 94720. Email: [email protected]. Tel: 510-643-5798. Fax: 510-643-2356.

Yang and Sastry are partially supported by ARO MURI W911NF-06-1-0076. Jafari is partially supported by the startup fund from the University of Texas and Texas Instruments. Bajcsy is partially supported by NSF IIS 0724682. Kuryloski, Iyengar, Sastry, and Bajcsy are partially supported by TRUST (Team for Research in Ubiquitous Secure Technology), which receives support from NSF CCF-0424422, AFOSR FA9550-06-1-0244, and the following organizations: Cisco, British Telecom, ESCHER, HP, IBM, iCAST, Intel, Microsoft, ORNL, Pirelli, Qualcomm, Sun, Symantec, Telecom Italia, and United Technologies.


In traditional sensor networks, the computation carried by the sensor board is fairly simple: Extract certain local information

and transmit the data to a computer server over the network for processing. With recent advances in power-efficient mobile

processors for sensor networks (e.g., FPGA and Intel XScale series), we are interested in studying new frameworks for

distributed pattern recognition. In such systems, each sensor node will be able to classify local, albeit biased, information.

Only when the local classification detects a possible object/event does the sensor node become active and transmit the

measurement to the server. On the server side, a global classifier receives data from the sensor nodes and further optimizes the

classification. The global classifier can be more computationally involved than the distributed classifiers, but it has to adapt to

the change of available active sensors due to local measurement error, sensor failure, and communication congestion.

Distributed pattern recognition on sensor networks has several advantages: 1. Good decisions about the validity of the local

information can reduce the communication between the nodes and the server, and therefore reduce power consumption. Previous

studies have shown the power consumption required to send one byte over a wireless network is equivalent to executing between

1e3 and 1e6 instructions on an onboard processor [21]. 2. The framework increases the robustness of action recognition on the

network. Particularly, as we will show later, one can choose to activate some or all of the sensor nodes on the fly, and the global

classifier is able to adaptively adjust the optimization process and improve the recognition upon local decisions. 3. The ability

for the sensor nodes to make biased local decisions also makes the design of the global classifier more flexible. For example,

a system that only monitors abnormal movements (e.g., falling or no movement) can make fairly good estimation using local

decisions and discard the global optimization, and in cases that the central system fails, the network can still support limited

recognition tasks using the distributed classifiers. 4. Finally, in a more general perspective beyond action recognition, the ability

for individual sensor nodes to make local decisions can be used as feedback to support certain autonomous actions/reactions

without relying on the intervention of a central system.

We define distributed action recognition as follows:

Problem 1 (Distributed segmentation and classification): Assume a set of L wearable sensor nodes with integrated triaxial

accelerometers and biaxial gyroscopes are attached to multiple locations of the human body. Denote

$$a_l(t) = (x_l(t), y_l(t), z_l(t), \theta_l(t), \rho_l(t))^T \in \mathbb{R}^5$$

as the measurement of the five sensors on node $l$ at time $t$, and

$$a(t) = (a_1^T(t), a_2^T(t), \cdots, a_L^T(t))^T \in \mathbb{R}^{5L}$$

collects all sensor measurements. Denote

$$s = (a(1), a(2), \cdots, a(l)) \in \mathbb{R}^{5L \times l}$$

as an action sequence of length $l$. Given $K$ different classes of human actions, a set of $n_i$ training examples $\{s_{i,1}, \cdots, s_{i,n_i}\}$ is collected for each $i$th class.

The durations of the sequences naturally may be different. Given a new test sequence s that may contain multiple actions

and possibly other outlying actions, we seek a distributed algorithm to simultaneously segment the sequence and classify the

actions.

Solving this problem mainly involves the following difficulties.

1) Simultaneous segmentation and classification. If the test sequence is pre-segmented, classification becomes straightforward

with many classical algorithms to choose from. In this paper, we seek simultaneous segmentation and recognition from

a long motion sequence. Furthermore, we also assume that the test sequence may contain other unknown actions that

are not from the K classes. The algorithm needs to be robust to these outliers.

2) Variation of action durations. One major difficulty in action recognition is to determine the duration of an action. Good

classification depends on correct estimation of both the starting time and the duration of an action. But in practice, the

durations of different actions may vary dramatically (see Fig 2).

Fig. 2. Population of different action durations in our data set.


3) Identity independence. In addition to the variation of action durations, different people act differently for the same actions

(see Fig 3). If both the training samples and the test samples are from the same subject, typically the classification could

be greatly simplified. However, it is well known that collecting large numbers of training samples in human biometrics is

expensive, particularly in medical-care oriented applications. Therefore it is desirable for an action recognition algorithm

to be identity independent. For a test sequence in the experiment, we examine the identity-independent performance by

excluding the training samples of the same subject.

Fig. 3. Readings of the x-axis accelerometers (top) and x-axis gyroscopes (bottom) from 8 distributed sensors (shown in different colors) on two repetitive “stand-kneel-stand” sequences from two subjects (left and right columns).

4) Distributed recognition. A distributed recognition system needs to further consider the following issues: 1. How to

extract compact and accurate low-dimensional action features for local classification and transmission over a band-

limited network? 2. How to classify the local measurement in real time using low-power processors? 3. How to design

a classifier to globally optimize the recognition and be adaptive to the change of the network?

a) Literature Overview: Action (or activity) recognition using wearable motion sensors has been a prominent topic in

the last five years. Initial studies were primarily focused on single accelerometers [9], [11] or other motion sensors [12], [19].

More recent systems prefer using multiple motion sensors [1], [2], [10], [13], [16], [17], [20]. Depending on the type of sensor

used, an action recognition system is typically composed of two parts: a feature extraction module and a classification module.

There are three major directions for feature extraction in wearable sensor networks. The first direction uses simple statistics

of a signal sequence such as the max, mean, variance, and energy [2], [10], [11], [13], [20]. The second type of feature is

computed using fixed filter banks such as FFT and wavelets [11], [19]. The third type is based on classical dimensionality

reduction techniques such as principal component analysis (PCA) and linear discriminant analysis (LDA) [16], [17]. In terms

of classification on the action features, a large body of previous work favored thresholding or k-nearest-neighbor (kNN) due

to the simplicity of the algorithms implemented on mobile devices [11], [19], [20]. Other more sophisticated techniques have

also been used, such as decision trees [2], [3] and hidden Markov models [16].

For distributed pattern recognition, there exist initial studies on distributed speech recognition [23] and distributed expert

systems [18]. In [23], the authors summarized three major categories of distributed recognition:¹ 1. All data are relayed to a

computer server for processing, e.g., on a closed-circuit camera system [14]. 2. All data are locally processed, e.g., [15]. One

may further choose to implement a global classifier by a majority-voting scheme on local decisions. 3. A full-fledged distributed

recognition system consists of both front-end processing for feature extraction and global processing for classification [6], [13],

[16], [17], [20]. Our distributed action recognition system falls into the last category. One particular problem associated with

this category is that each local observation from the distributed sensors is biased and may be insufficient to classify all classes.

For example, in our system, the sensors placed on the lower body would not perform well in classifying actions that mainly

involve upper body motions. Consequently, one cannot expect majority-voting type classifiers to perform well globally.

b) Contributions of the paper: We propose a distributed action recognition algorithm that simultaneously segments and

classifies 12 human actions using 1 to 8 wearable motion sensor nodes. We assume the wearable sensor network is a typical one-hop

wireless network and all the sensor nodes communicate with a central computer. The work is inspired by a recent study on

face recognition using sparse representation and ℓ1-minimization [22]. We assume each action class satisfies a low-dimensional

subspace model. We show that a 10-D LDA feature space suffices to locally represent the 12 action subspaces on each node.

If a linear representation is sought to represent a valid test sample w.r.t. all training samples, the dominant coefficients in the

sparsest representation correspond to the training samples from the same action class, and hence they encode the membership

of the test sample. We further study fast linear programming routines to solve for such sparse representation.

We investigate a distributed framework for simultaneous segmentation and classification of individual actions from a motion

sequence. On each sensor node, a classifier searches for good segmentation on multiple temporal resolutions. We propose an

effective method to reject action segments that do not correspond to any training class as outliers. Hence an inlying action

segment simultaneously provides the localization of the action and its membership.

¹ In certain situations it is desirable to consider a completely distributed recognition system, where there is no central system and the recognition results on the nodes converge over time via node-to-node communication. In this paper, having a base station is still a practical and efficient solution.


When a sensor node detects a valid action segment, it transmits its 10-D feature to the server. The global classifier receives

the distributed feature vectors, and then seeks a global sparse representation of the action features against the corresponding

feature vectors of all the training samples. The global optimization is adaptive to the change of available active nodes.

This paper focuses on the distributed action recognition framework. The algorithm is simulated in software using

MATLAB. Currently our data set is mainly designed for transient actions (e.g., jumping, kneeling, and stand-to-sit), but it

also contains a limited number of nontransient actions (i.e., turning, going upstairs and downstairs). We are in the process of

gradually expanding the number of subjects and action classes in the database.

II. DESIGN OF THE WEARABLE SENSOR NETWORK

The wearable sensor network consists of sensor nodes placed at various body locations, which communicate with a base

station attached to a computer server through a USB port. The sensor nodes and base station are built using the commercially

available Tmote Sky boards. Tmote Sky runs TinyOS on an 8 MHz microcontroller with 10 KB of RAM and communicates using

the 802.15.4 wireless protocol. Each custom-built sensor board carries a triaxial accelerometer and a biaxial gyroscope and is

attached to a Tmote Sky (shown in Fig 4). Each axis is reported as a 12-bit value to the node, indicating values in the range of

±2g and ±500°/s for the accelerometer and gyroscope, respectively. Each node is currently powered by two AA batteries.
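As a rough illustration of the value ranges above, the sketch below maps a raw 12-bit reading to physical units. It assumes an unsigned encoding in which code 0 maps to the negative full scale and code 4095 to the positive full scale; the actual encoding and calibration of the sensor board are not specified in the text, and the name `counts_to_units` is hypothetical.

```python
# Hypothetical conversion of a raw 12-bit reading to physical units.
# Assumption: unsigned codes 0..4095 map linearly onto [-full_scale, +full_scale].

ADC_BITS = 12
ADC_MAX = (1 << ADC_BITS) - 1  # 4095

ACCEL_FULL_SCALE_G = 2.0       # accelerometer range: +/-2 g
GYRO_FULL_SCALE_DPS = 500.0    # gyroscope range: +/-500 degrees per second


def counts_to_units(raw: int, full_scale: float) -> float:
    """Map a raw code in [0, 4095] linearly onto [-full_scale, +full_scale]."""
    if not 0 <= raw <= ADC_MAX:
        raise ValueError("raw reading must fit in 12 bits")
    return (2.0 * raw / ADC_MAX - 1.0) * full_scale


# Resolution implied by the stated ranges under this assumed linear mapping.
accel_step_g = 2.0 * ACCEL_FULL_SCALE_G / ADC_MAX
gyro_step_dps = 2.0 * GYRO_FULL_SCALE_DPS / ADC_MAX
```

Under this assumed mapping, one count corresponds to roughly 0.001 g for the accelerometer and roughly 0.24°/s for the gyroscope.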

Fig. 4. The sensor board with the accelerometer and gyroscope. The mother board at the back is Tmote Sky.

The current hardware design of the sensor contributes certain amounts of measurement error. The accelerometers typically

require some calibration in the form of a linear correction, as sensor output under 1g may be shifted by up to 15% in some

sensors. It is also worth noting that the gyroscopes produce an indication of rotation under straight line motions. Fortunately

these systematic errors appear to be consistent across experiments for a given sensor board. However, without calibration to

correct them, the errors may affect the action recognition if different sets of sensors are used interchangeably in the experiment.

To avoid packet collision in the network, we use a TDMA protocol that allocates each node a specific time slot during

which to transmit data. This allows us to receive sensor data at 20 Hz with minimal packet loss. To avoid clock drift in the network,

the base station periodically broadcasts a packet to resynchronize the nodes’ individual timers. The code to interface with the

sensors and transmit data is implemented directly on the mote using nesC, a variant of C.
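The timing implied by this scheme can be sketched as follows. The example assumes one frame per 20 Hz sample period, divided into equal slots for 8 nodes; the actual slot layout, guard times, and resynchronization period are not given in the text, so `slot_start_ms`, `node_for_time`, and the constants below are hypothetical.

```python
# Sketch of an equal-slot TDMA schedule (hypothetical layout).
# Assumption: each node transmits once per 20 Hz sample period, in its own slot.

SAMPLE_RATE_HZ = 20
NUM_NODES = 8

FRAME_MS = 1000.0 / SAMPLE_RATE_HZ   # 50 ms per frame
SLOT_MS = FRAME_MS / NUM_NODES       # 6.25 ms per node


def slot_start_ms(node_id: int, frame_index: int) -> float:
    """Start time of the slot for `node_id` in frame `frame_index`,
    measured from the last resynchronization broadcast."""
    if not 0 <= node_id < NUM_NODES:
        raise ValueError("unknown node id")
    return frame_index * FRAME_MS + node_id * SLOT_MS


def node_for_time(t_ms: float) -> int:
    """Index of the node that owns the channel at time t_ms."""
    return int((t_ms % FRAME_MS) // SLOT_MS)
```

Because every node derives its slot from its own timer, the periodic resynchronization broadcast is what keeps these slot boundaries aligned across nodes.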

III. CLASSIFICATION VIA SPARSE REPRESENTATION

In this section, we present an efficient action classification method to recognize pre-segmented action sequences on each

sensor node via ℓ1-minimization. We first discuss the representation of action samples in vector form. Given an action segment of length $l$ from node $j$, $s_j = (a_j(1), a_j(2), \cdots, a_j(l)) \in \mathbb{R}^{5 \times l}$, define a new vector $s_j^S$ as the stacking of the $l$ columns of $s_j$:

$$s_j^S \doteq (a_j(1)^T, a_j(2)^T, \cdots, a_j(l)^T)^T \in \mathbb{R}^{5l}. \tag{1}$$

We will use $s_j$ and $s_j^S$ interchangeably to denote the stacked vector when this causes no ambiguity.

Since the length l varies among different subjects and actions, we need to normalize l to be the same for all the training and

test samples, which can be achieved by linear interpolation or FFT interpolation. After normalization, we denote the dimension

of samples $s_j$ as $D_j = 5l$. Subsequently, we define a new vector $v$ that stacks the measurements from all $L$ nodes:

$$v = (s_1^T, s_2^T, \cdots, s_L^T)^T \in \mathbb{R}^D, \tag{2}$$

where $D = D_1 + \cdots + D_L = 5lL$.
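The normalization and stacking steps can be sketched in Python with NumPy. This is a minimal illustration, not the authors' MATLAB code: it uses linear interpolation (one of the two options mentioned above), and the function names and the `(5, l)` array layout are assumptions made for the example.

```python
import numpy as np


def normalize_length(segment: np.ndarray, target_len: int = 64) -> np.ndarray:
    """Resample a (5, l) segment to (5, target_len) by linear interpolation."""
    channels, length = segment.shape
    old_t = np.linspace(0.0, 1.0, length)
    new_t = np.linspace(0.0, 1.0, target_len)
    return np.stack([np.interp(new_t, old_t, segment[c]) for c in range(channels)])


def stack_columns(segment: np.ndarray) -> np.ndarray:
    """Stack the columns a_j(1), ..., a_j(l) into one vector of length 5*l, as in (1)."""
    return segment.T.reshape(-1)  # column-by-column order


def build_global_vector(node_segments: list) -> np.ndarray:
    """Concatenate the normalized, stacked vectors of all L nodes, as in (2)."""
    return np.concatenate(
        [stack_columns(normalize_length(s)) for s in node_segments]
    )
```

With segments from three nodes, for example, `build_global_vector` returns a vector with 5 × 64 × 3 = 960 entries, regardless of the original segment lengths.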


In this paper, we assume the samples v in an action class satisfy a subspace model, called an action subspace. If the training

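samples $\{v_1, \cdots, v_{n_i}\}$ of the $i$th class sufficiently span the $i$th action subspace, given a test sample $y = (y_1^T, \cdots, y_L^T)^T \in \mathbb{R}^D$ in the same class $i$, $y$ can be linearly represented using the training examples of the same class:

$$y = \alpha_1 v_1 + \cdots + \alpha_{n_i} v_{n_i} \;\Leftrightarrow\; \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_L \end{pmatrix} = \left( \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_{1} \cdots \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_{n_i} \right) \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n_i} \end{pmatrix}. \tag{3}$$

It is important to note that such a linear constraint also holds on each node $j$: $y_j = \alpha_1 s_{j,1} + \cdots + \alpha_{n_i} s_{j,n_i} \in \mathbb{R}^{D_j}$.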

In theory, complex data such as human actions typically constitute complex nonlinear models. The linear models are used

to approximate such nonlinear structures in a higher-dimensional subspace (see Fig 5). Notice that such linear approximation

may not produce good estimation of the distance/similarity metric for the samples on the manifold. However, as we will show

in Example 1, given sufficient samples on the manifold as training examples, a new test sample can be accurately represented

on the subspace, provided that any two classes do not have similar subspace models.

Fig. 5. Modeling a 1-D manifold M using a 2-D subspace V .

In this paper, we are interested in recovering label(y). A previous study [22] proposed to reformulate the recognition using

a global sparse representation: Since label(y) = i is unknown, we can represent $y$ using all the training samples from the $K$ classes:

$$y = (A_1, A_2, \cdots, A_K) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \end{pmatrix} = Ax, \tag{4}$$

where $A_i = (v_{i,1}, v_{i,2}, \cdots, v_{i,n_i}) \in \mathbb{R}^{D \times n_i}$ collects all the training samples of class $i$, $x_i = (\alpha_{i,1}, \alpha_{i,2}, \cdots, \alpha_{i,n_i})^T \in \mathbb{R}^{n_i}$ collects the corresponding coefficients in (3), and $A \in \mathbb{R}^{D \times n}$ where $n = n_1 + n_2 + \cdots + n_K$.

Since y satisfies both (3) and (4), one solution of x in (4) should be

$$x^* = (0, \cdots, 0, x_i^T, 0, \cdots, 0)^T. \tag{5}$$

The solution is naturally sparse: on average, only a fraction $1/K$ of the terms in $x^*$ are nonzero. Furthermore, $x^*$ is also a solution for the representation on each node $j$:

$$y_j = (A_1^{(j)}, A_2^{(j)}, \cdots, A_K^{(j)}) \, x = A^{(j)} x, \tag{6}$$

where $A_i^{(j)} \in \mathbb{R}^{D_j \times n_i}$ consists of the row vectors in $A_i$ that correspond to the $j$th node. Hence, $x^*$ can be solved either globally

using (4) or locally using (6), provided that the action data measured on each node are sufficiently discriminant. We will come

back to the discussion about local classification versus global classification in Section IV. In the rest of this section however,

our focus will be on each node.

One major difficulty in solving (6) is the high dimensionality of the action data. For example, in this paper, we normalize

l = 64 for all action segments (see Fig 2 for the distribution of original lengths). Then Dj = 64 × 5 = 320 for yj on each

node. The high dimensionality makes it difficult to either directly solve for x on the node or transmit the action data over a

band-limited wireless channel. In compressed sensing [4], [5], one reduces the dimension of a linear system by choosing a

linear projection $R_j \in \mathbb{R}^{d \times D_j}$:²

$$\tilde{y}_j \doteq R_j y_j = R_j A^{(j)} x \doteq \tilde{A}^{(j)} x \in \mathbb{R}^d. \tag{7}$$

² Notice that $R_j$ is not computed on the sensor node. These matrices are computed offline and simply stored on each sensor node.


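As a result, the action feature $\tilde{y}_j$ is more efficient to transmit than $y_j$ in the original $D_j$-dimensional data space. On the network server, the global action vector is of the following form:

$$\tilde{y} = \begin{pmatrix} \tilde{y}_1 \\ \tilde{y}_2 \\ \vdots \\ \tilde{y}_L \end{pmatrix} = \begin{pmatrix} R_1 & 0 & \cdots & 0 \\ 0 & R_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & R_L \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_L \end{pmatrix} \doteq R y \in \mathbb{R}^{dL}, \tag{8}$$

where $R \in \mathbb{R}^{dL \times D}$ is equivalent to a global projection matrix.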

After the projection Rj , typically the feature dimension d is much smaller than the number n of all training samples.

Therefore, the new linear system (7) is underdetermined. Numerically stable solutions exist to uniquely recover sparse solutions

$x^*$ via ℓ1-minimization [7]:

$$x^* = \arg\min \|x\|_1 \quad \text{subject to} \quad \tilde{y}_j = \tilde{A}^{(j)} x. \tag{9}$$
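For reference, a standard way to pose (9) as a linear program, which is what basis pursuit solvers exploit, is to split $x$ into its positive and negative parts. The auxiliary variables $u$ and $v$ are introduced here for illustration only and do not appear in the original text:

$$x = u - v, \qquad \min_{u \ge 0,\; v \ge 0} \; \mathbf{1}^T (u + v) \quad \text{subject to} \quad \begin{pmatrix} \tilde{A}^{(j)} & -\tilde{A}^{(j)} \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \tilde{y}_j.$$

At an optimum, $u$ and $v$ have disjoint supports, so $\mathbf{1}^T (u + v) = \|x\|_1$ and the two problems share the same minimizer.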

In our experiment, we have tested multiple projection operators including PCA, LDA, and the random projection advocated in [22]. We found that 10-D feature spaces using LDA lead to the best recognition in a very low-dimensional space.

After the (sparsest) representation $x$ is recovered, we project the coefficients onto each action subspace:

$$\delta_i(x) = (0, \cdots, 0, x_i^T, 0, \cdots, 0)^T \in \mathbb{R}^n, \quad i = 1, \cdots, K. \tag{10}$$

Finally, the membership of the test sample $\tilde{y}_j$ is assigned to the class with the smallest residual:

$$\text{label}(\tilde{y}_j) = \arg\min_i \|\tilde{y}_j - \tilde{A}^{(j)} \delta_i(x)\|_2. \tag{11}$$
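The classification rule in (10) and (11) can be sketched in Python with NumPy. As a stand-in for the basis pursuit solver discussed in the text, the sketch uses ISTA (iterative soft-thresholding) on a LASSO-type objective; this substitution, the function names, the regularization weight `lam`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z: np.ndarray, t: float) -> np.ndarray:
    """Shrink every entry of z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def l1_solve_ista(A, y, lam=0.01, n_iter=500):
    """Approximate a sparse solution of y = A x by minimizing
    0.5 * ||A x - y||_2^2 + lam * ||x||_1 with ISTA (a LASSO-type
    stand-in for basis pursuit, not the solver used in the paper)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, lam * step)
    return x


def classify_by_residual(A, y, labels, x):
    """Apply (10) and (11): keep the coefficients of one class at a time and
    return the class whose partial reconstruction is closest to y."""
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        delta = np.where(labels == c, x, 0.0)        # delta_c(x) in (10)
        residuals.append(np.linalg.norm(y - A @ delta))
    return classes[int(np.argmin(residuals))], np.array(residuals)


# Toy data: two classes whose training samples lie in different coordinate planes.
A = np.array([[1.0, 0.8, 0.0, 0.0],
              [0.0, 0.6, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.6],
              [0.0, 0.0, 0.0, 0.8]])
labels = np.array([0, 0, 1, 1])                      # class of each column of A
y = 0.5 * A[:, 0] + 0.5 * A[:, 1]                    # a test sample from class 0
x = l1_solve_ista(A, y)
label, residuals = classify_by_residual(A, y, labels, x)
```

In this deliberately well-separated toy case, the recovered coefficients fall entirely on the class-0 columns, so the class-0 residual is much smaller than the class-1 residual.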

Example 1 (Classification on Nodes): We designed 12 action categories in the experiment: Stand-to-Sit, Sit-to-Stand, Sit-to-

Lie, Lie-to-Sit, Stand-to-Kneel, Kneel-to-Stand, Rotate-Right, Rotate-Left, Bend, Jump, Upstairs, and Downstairs. The detailed

experiment setup is given in Section V.

To implement ℓ1-minimization on the sensor node, we look for fast sparse solvers in the literature. We have tested a variety

of methods including (orthogonal) matching pursuit (MP), basis pursuit (BP), LASSO, and a quadratic log-barrier solver.³ We

found that BP [8] gives the best trade-off between speed, noise tolerance, and recognition accuracy.

Here we demonstrate the accuracy of the BP-based algorithm on each sensor node (see Fig 1 for their locations). The actions

are manually segmented from a set of long motion sequences from three subjects. In total there are 626 samples in the data set.

The 10-D feature selection is via LDA. We require the classification to be identity-independent. Therefore, for each test sample

from a subject, we use all samples from the other two subjects to form the training set. The accuracy of the classification is

shown in Table I. Fig 6 shows an example of the estimated sparse coefficients x and its residuals. In terms of speed, our

simulation in MATLAB takes on average 0.03 s to process one test sample on a typical 3 GHz PC.
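The identity-independent protocol described above amounts to a leave-one-subject-out split. A minimal sketch, assuming each sample is stored as a `(subject_id, label, feature)` tuple (a layout chosen here for illustration, with made-up subject names and feature values):

```python
def leave_one_subject_out(samples):
    """Yield (test_sample, training_set) pairs in which the training set
    excludes every sample recorded from the test sample's subject."""
    for test in samples:
        train = [s for s in samples if s[0] != test[0]]
        yield test, train


# Hypothetical miniature data set: (subject_id, label, feature vector).
samples = [
    ("A", "Jump", [0.1, 0.2]),
    ("A", "Bend", [0.3, 0.1]),
    ("B", "Jump", [0.2, 0.2]),
    ("C", "Bend", [0.4, 0.0]),
]
splits = list(leave_one_subject_out(samples))
```

Each of the four test samples is paired with a training set drawn only from the other two subjects.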

TABLE I. RECOGNITION ON EACH NODE ON 12 ACTION CLASSES.

Sensor #   1     2     3     4     5     6     7     8
Acc [%]    99.9  99.4  99.9  100   95.3  99.5  93    100

Fig. 6. A BP-based ℓ1 solution and its corresponding residuals of a Stand-to-Sit action on the waist node. The action is correctly classified as class 1. SCI(x) = 0.7 (see (13)).

Example 1 shows that if the segmentation of the actions is known and there are no other invalid samples, all sensor nodes can

recognize the 12 actions individually with very high accuracy, which also verifies that the mixture subspace model is a good

approximation of the action data. Nevertheless, one may argue that in such low-dimensional feature spaces other classical

methods (e.g., kNN and decision tree methods) should also perform well. In the next section, we will show that the major

advantage of adopting the sparse representation framework is a unified solution to recognize and segment valid actions and

reject invalid ones. We will also show that the method is adaptive to the change of available sensor nodes on the fly.

³ The implementation of these routines in MATLAB is available in SparseLab: http://sparselab.stanford.edu


IV. DISTRIBUTED SEGMENTATION AND RECOGNITION

There have been two major approaches in the past to provide partial solutions to simultaneous segmentation and recognition

of human actions on wearable sensors. The first solution assumes different actions are separated by a “rest” state, and such

states can be detected by energy thresholding or a special classifier to distinguish between rest and non-rest. The second

solution assumes all sensors in the network are available at all times, and rejects invalid samples based on the sample distance

between the test and training examples. These two approaches have several drawbacks: 1. For the first approach, the validity

of the rest state between actions is not physically guaranteed. For example, nontransient actions such as walking and running

may last for a long period. 2. The second approach is not robust when the number of active sensors changes over time. In

this case, tuning a list of different distance thresholds to reject outliers as the number of sensors changes can be difficult, and

the result still depends heavily on the condition of the training samples.

We propose a novel framework to simultaneously segment and recognize human actions using the (10-D LDA) action features

extracted from a network of distributed sensors. The unified outlier rejection method applies to both individual nodes and the

global classifier. The outlying action segments may be caused by unknown actions performed by the subjects or by incorrect

segmentation. As a result, the extracted inlying action segments simultaneously provide the segmentation of the actions and

their labels. The framework is also robust w.r.t. different action durations and the change of available sensor nodes.

We first introduce multi-resolution action detection on each sensor node. From the training examples, we can estimate a range

of possible lengths for all actions of interest. We then evenly divide the range into multiple length hypotheses: $(h_1, \cdots, h_s)$. At each time $t$ in a motion sequence, the node tests a set of $s$ possible segmentations:⁴

$$y^{(1)} = (a(t - h_1), \cdots, a(t)), \; \cdots, \; y^{(s)} = (a(t - h_s), \cdots, a(t)), \tag{12}$$

as shown in Fig 7. With each candidate $y$ normalized to length $l$, a sparse representation $x$ is estimated using the ℓ1-minimization of Section III.
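The multi-resolution search in (12) can be sketched as follows. The function names and the `(5, T)` array layout are assumptions, as is the choice to skip hypotheses that would reach before the start of the recording; the text does not say how such boundary cases are handled.

```python
import numpy as np


def length_hypotheses(h_min: int, h_max: int, s: int) -> list:
    """Evenly divide [h_min, h_max] into s candidate segment lengths."""
    return [int(round(h)) for h in np.linspace(h_min, h_max, s)]


def candidate_segments(sequence: np.ndarray, t: int, hypotheses: list) -> list:
    """Return the candidates (a(t - h), ..., a(t)) of (12) for one node.

    `sequence` has shape (5, T). Hypotheses that would start before the
    beginning of the recording are skipped (an assumption of this sketch)."""
    return [sequence[:, t - h : t + 1] for h in hypotheses if t - h >= 0]
```

Each returned candidate would then be normalized to the common length and passed to the sparse solver, one candidate per length hypothesis.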

Fig. 7. Multiple segmentation hypotheses on a wrist sensor at time t = 150 of a “go downstairs” sequence. h1 is a good segment while the others are false segments. Notice that the movement between 250 and 350 is an outlying action the subject performed.

Based on the previous sparsity assumption, if y is not a valid segmentation w.r.t. the training examples due to either incorrect

t or h, or the real action performed is not in the training classes, the dominant coefficients of its sparsest representation x

should not correspond to any single class (as shown in Fig 8). We use a sparsity concentration index (SCI) [22]:

$$\text{SCI}(x) \doteq \frac{K \cdot \max_{j=1,\cdots,K} \|\delta_j(x)\|_1 / \|x\|_1 - 1}{K - 1} \in [0, 1]. \tag{13}$$

If the nonzero coefficients of x are evenly distributed among K classes, then SCI(x) = 0; if all the nonzero coefficients are

associated with a single class, then SCI(x) = 1. Therefore, we introduce a sparsity threshold τ1 applied to all sensor nodes:

If SCI(x) > τ1, the segment is a valid local measurement, and its 10-D LDA features y will be sent to the base station.
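The index in (13) and the node-side test are straightforward to compute. In the sketch below, the function names and the `labels` array (giving the class of the training sample behind each coefficient) are assumptions, as is the convention of returning 0 for an all-zero coefficient vector.

```python
import numpy as np


def sci(x: np.ndarray, labels: np.ndarray, num_classes: int) -> float:
    """Sparsity concentration index of (13).

    `labels[k]` is the class (0 .. num_classes - 1) of the training sample
    that coefficient x[k] multiplies."""
    total = np.abs(x).sum()
    if total == 0.0:
        return 0.0  # convention chosen here for an all-zero x
    per_class = [np.abs(x[labels == c]).sum() for c in range(num_classes)]
    return float((num_classes * max(per_class) / total - 1.0) / (num_classes - 1))


def is_valid_local_segment(x, labels, num_classes, tau1):
    """Node-side test: transmit the feature only if SCI(x) exceeds tau1."""
    return sci(x, labels, num_classes) > tau1
```

Coefficients spread evenly over all classes give an index of 0, and coefficients confined to one class give an index of 1, matching the two extremes described above.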

Fig. 8. The ℓ1 solution and corresponding residuals of an outlying sample on the waist node. SCI(x) = 0.13.

Next, we introduce a global classifier that adaptively optimizes the overall segmentation and classification. Suppose at time

t and with a length hypothesis h, the base station receives L′ action features from the active sensors (L′ ≤ L). Without loss

⁴ A segmentation candidate should be ignored if it overlaps with a previously detected result.


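of generality, assume these features are from the first $L'$ sensors: $\tilde{y}_1, \tilde{y}_2, \cdots, \tilde{y}_{L'}$. Let

$$\tilde{y}' = (\tilde{y}_1^T, \cdots, \tilde{y}_{L'}^T)^T \in \mathbb{R}^{10L'}. \tag{14}$$

Then the global sparse representation $x$ of $\tilde{y}'$ satisfies the following linear system:

$$\tilde{y}' = \begin{pmatrix} R_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & R_{L'} & \cdots & 0 \end{pmatrix} A x = R' A x = \tilde{A}' x, \tag{15}$$

where $R' \in \mathbb{R}^{dL' \times D}$ is a new projection matrix that only extracts the action features from the first $L'$ nodes. Consequently, the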

effect of changing active sensor nodes for the global classification is formulated via the global projection matrix R′. During

the transformation, the data matrix A and the sparse representation x remain unchanged. The two linear systems (7) and (8)

then become special cases of (15), where L′ = 1 and L, respectively.
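The role of the projection matrix in (15) can be illustrated with a short NumPy sketch. The function name and argument layout are hypothetical, and the sketch generalizes slightly from "the first L′ nodes" to an arbitrary list of active node indices.

```python
import numpy as np


def global_projection(projections: list, active: list) -> np.ndarray:
    """Build the projection matrix of (15): one block row per active node,
    with zeros in the columns that belong to other nodes.

    `projections[j]` is the d x D_j matrix R_j of node j; `active` lists the
    indices of the nodes that transmitted a feature."""
    dims = [R.shape[1] for R in projections]
    offsets = np.concatenate(([0], np.cumsum(dims)))
    rows = []
    for j in active:
        block = np.zeros((projections[j].shape[0], offsets[-1]))
        block[:, offsets[j]:offsets[j + 1]] = projections[j]
        rows.append(block)
    return np.vstack(rows)


# Toy dimensions: 3 nodes, D_j = 4 raw dimensions per node, d = 2 features, n = 6 samples.
rng = np.random.default_rng(0)
R = [rng.standard_normal((2, 4)) for _ in range(3)]
A = rng.standard_normal((12, 6))             # training matrix, one column per sample
R_prime = global_projection(R, active=[0, 2])
A_prime = R_prime @ A                        # only the projection changes; A and x do not
```

When the set of active nodes changes, only the projection needs to be rebuilt; the stored training matrix is reused as is.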

Similar to the outlier rejection criterion we previously proposed on each node, we introduce a global rejection threshold τ2.

If SCI(x) > τ2 in (15), the most significant coefficients in x are concentrated in a single training class. Hence $\tilde{y}'$ is assigned

to that class, and its length hypothesis h provides the segmentation of the action from the motion sequence.⁵

The overall algorithm on the nodes and on the network server provides a unified solution to segment and classify action

segments from a motion sequence using only two simple parameters τ1 and τ2. Typically τ1 is selected to be less restrictive than

τ2 in order to increase the recall rate, because passing certain amounts of false signal to the global classifier is not necessarily

disastrous as the signal would be rejected by τ2 when the action features from multiple nodes are jointly considered.

Finally, we consider how the change of active nodes affects the estimation of x and the classification of the actions. In

compressed sensing, the efficacy of ℓ1-minimization in solving for the sparsest solution x in (15) is characterized by the ℓ0/ℓ1

equivalence relation [7], [8]. A necessary and sufficient condition for the equivalence to hold is the k-neighborliness of $\tilde{A}'$. As

a special case, one can show that if x is the sparsest solution in (15) for L′ = L, x is also a solution for L′ < L. Hence, a

decrease of L′ may lead to sparser solutions of x.
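One way to see this special case, sketched here with $\tilde{A}^{(j)} = R_j A^{(j)}$ denoting the block of rows of (15) contributed by node $j$: dropping nodes only removes equality constraints, so the feasible set can only grow.

$$\{x : \tilde{A}^{(j)} x = \tilde{y}_j,\; j = 1, \cdots, L\} \;\subseteq\; \{x : \tilde{A}^{(j)} x = \tilde{y}_j,\; j = 1, \cdots, L'\}, \qquad L' < L.$$

Minimizing $\|x\|_0$ or $\|x\|_1$ over the larger set on the right therefore gives a value no greater than minimizing over the set on the left.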

On the other hand, the decrease in available action features also makes $\tilde{y}'$ less discriminant. For example, if we reduce

L′ to 1 and only activate a wrist sensor, then the ℓ1 solution x may have nonzero coefficients associated with multiple actions

with similar wrist motions, albeit sparser. This is an inherent problem for any method to classify human actions using a limited

number of motion sensors. In theory, if two action subspaces in a low-dimensional feature space have a small subspace distance

after the projection, the corresponding sparse representation cannot distinguish the test samples from the two classes. We will

demonstrate in Section V that indeed reducing the available motion sensors will reduce the discriminant power of the action

features in a lower-dimensional space.

In summary, the formulation of adaptive global classification (15) via a global projection matrix R′ compares favorably

to other classical methods such as kNN and decision trees mainly for the following two reasons: 1. The framework provides

a simple means to reject outliers via two sparsity constraints τ1 and τ2. 2. The effects of changing action features can be

quantitatively studied via R′ and its ℓ0/ℓ1 equivalence.

V. EXPERIMENT

We test the performance of the system using a data set we collected from three male subjects aged 28, 30, and 32.

Eight wearable sensors were placed at different body locations (see Fig 1). We designed a set of 12 action classes:

Stand-to-Sit (StSi), Sit-to-Stand (SiSt), Sit-to-Lie (SiLi), Lie-to-Sit (LiSi), Stand-to-Kneel (StKn), Kneel-to-Stand (KnSt), Rotate-

Right (RoR), Rotate-Left (RoL), Bend, Jump, Upstairs (Up), and Downstairs (Down). We are particularly interested in testing

the system under various action durations. For this purpose, we have asked the subjects to perform StSi, SiSt, SiLi, and LiSi

with two different speeds (slow and fast), and perform RoR and RoL with two different rotation angles (90° and 180°). All

subjects were asked to perform a sequence of related actions in each recording session based on their own interpretation of

the actions (e.g., Fig 3). In total there are 626 actions performed in the data set (see Table III for the numbers in individual

classes).

We evaluate the distributed recognition algorithm against three criteria: 1. What is the accuracy of the algorithm with

all 8 sensors activated, and how well can the global classifier adjust when a certain number of nodes are dropped from the

network? 2. Can a set of heuristically selected parameters {τ1, τ2} effectively segment valid actions with different

available nodes? 3. How much communication can be saved by having each node reject invalid local measurements, compared to simply

streaming all action features to the base station?

Table II shows the accuracy of the algorithm in terms of Precision versus Recall and with different sets of sensor nodes.

For all experiments, τ1 = 0.2 and τ2 = 0.4. If all nodes are activated, the algorithm can achieve 98.8% accuracy among the

actions it extracted, and 94.2% of the true actions are detected. The performance decreases gracefully when more nodes become

⁵ At time t, if multiple hypotheses pass the rejection threshold τ2, one may heuristically select one based on a preference for longer or shorter segments, or other heuristics such as the number of active sensors.


unavailable to the global classifier. Our results show that if we can maintain one motion sensor for the upper body (e.g., at

position 2) and one for the lower body (e.g., at position 7), the algorithm can still achieve 94.4% precision and 82.5% recall.

Finally, on average the 8 distributed classifiers that reject invalid local measurements reduce the node-to-station communication

by more than 50%. Please refer to the Appendix for the rendering of the segmentation results on the motion sequences.

TABLE II. PRECISION VS. RECALL WITH DIFFERENT SETS OF ACTIVATED SENSORS.

Sensors     2      7      2,7    1,2,7   1-3,7,8   1-8
Prec [%]    89.8   94.6   94.4   92.8    94.6      98.8
Rec [%]     65     61.5   82.5   80.6    89.5      94.2
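The two measures reported in Table II can be computed from simple counts. The following is a minimal sketch, assuming that precision is the share of extracted actions that are correct and recall is the share of true actions that are correctly detected; the function name and the example counts are hypothetical and are not taken from the experiment.

```python
def precision_recall(num_correct, num_extracted, num_true):
    """Return (precision, recall) as fractions in [0, 1].

    precision: correctly classified actions / actions extracted by the algorithm.
    recall:    correctly classified actions / true actions in the data set.
    """
    if num_extracted <= 0 or num_true <= 0:
        raise ValueError("num_extracted and num_true must be positive")
    return num_correct / num_extracted, num_correct / num_true


# Hypothetical counts, chosen only to illustrate the arithmetic.
precision, recall = precision_recall(num_correct=90, num_extracted=100, num_true=120)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```

A classifier that rejects many segments can therefore score a high precision and a low recall at the same time, which is the pattern seen for the single-sensor columns.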

One may be curious about the relatively low recall on single sensors such as 2 and 7, particularly compared to the results

in Table I. This performance difference is due to the large number of potential outlying segments present in a long motion

sequence (e.g., see Fig 7). We can further compare the difference using the two confusion tables, Tables III and IV. We see that a single

node 2 that is positioned on the right wrist performed poorly mainly on two action categories: Stand-Kneel and Upstairs-

Downstairs, both of which involve significant movements of the lower body but not the upper one. This is the main reason

for the low recall in Table II. On the other hand, for the actions that are detected using node 2, our system can still achieve

about 90% accuracy, which clearly demonstrates the robustness of the distributed recognition framework. Similar arguments

also apply to node 7 and other sensor combinations.

TABLE III. CONFUSION TABLE USING SENSORS 1-8.

TABLE IV. CONFUSION TABLE USING SENSOR 2.

VI. CONCLUSION AND DISCUSSION

Inspired by the emerging compressed sensing theory, we have proposed a distributed recognition framework to segment

and classify human actions on a wearable motion sensor network. The framework provides a unified solution based on ℓ1-minimization

to classify valid action segments and reject outlying actions on the sensor nodes and the base station. We have

shown through our experiment that a set of 12 action classes can be accurately represented and classified using a set of 10-D

LDA features measured at multiple body locations. The proposed global classifier can adaptively adjust the global optimization

to boost the recognition upon available local measurements.

One limitation in the current system is that the wearable sensors need to be firmly fastened at the designated locations.

However, a more practical system/algorithm should tolerate certain degrees of offsets without sacrificing the accuracy. In this

case, the variation of the measurement for different action classes would increase substantially. One open question is what

low-dimensional linear/nonlinear models one may use to model such more complex data, and whether the sparse representation

framework can still apply to approximate such structures with limited numbers of training examples. A potential solution to

this question will be a meaningful step forward both in theory and in practice.

REFERENCES

[1] R. Aylward and J. Paradiso. A compact, high-speed, wearable sensor network for biomotion capture and interactive media. In Proceedings of the International Conference on Information Processing in Sensor Networks, 2007.
[2] L. Bao and S. Intille. Activity recognition from user-annotated acceleration data. In Proceedings of the International Conference on Pervasive Computing, 2004.
[3] A. Benbasat and J. Paradiso. Groggy wakeup - automated generation of power-efficient detection hierarchies for wearable sensors. In Proceedings of International Workshop on Wearable and Implantable Body Sensor Networks, 2007.
[4] E. Candes. Compressive sampling. In Proceedings of the International Congress of Mathematicians, 2006.
[5] E. Candes and T. Tao. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Transactions on Information Theory, 52(12):5406–5425, 2006.
[6] C. Chang and H. Aghajan. Collaborative face orientation detection in wireless image sensor networks. In Proceedings of Distributed Smart Cameras Workshop, 2006.
[7] D. Donoho. Neighborly polytopes and sparse solution of underdetermined linear equations. preprint, 2005.
[8] D. Donoho and M. Elad. On the stability of the basis pursuit in the presence of noise. Signal Processing, 86:511–532, 2006.
[9] J. Farringdon, A. Moore, N. Tilbury, J. Church, and P. Biemond. Wearable sensor badge & sensor jacket for context awareness. In Proceedings of the International Symposium on Wearable Computers, pages 107–113, 1999.
[10] E. Heinz, K. Kunze, and S. Sulistyo. Experimental evaluation of variations in primary features used for accelerometric context recognition. In Proceedings of the European Symposium on Ambient Intelligence, 2003.
[11] T. Huynh and B. Schiele. Analyzing features for activity recognition. In Proceedings of the Joint Conference on Smart Objects and Ambient Intelligence, 2005.
[12] H. Kemper and R. Verschuur. Validity and reliability of pedometers in habitual activity research. European Journal of Applied Physiology, 37(1):71–82, 1977.
[13] N. Kern, B. Schiele, and A. Schmidt. Multi-sensor activity context detection for wearable computing. In Proceedings of the European Symposium on Ambient Intelligence, 2003.
[14] I. Kim, J. Shim, J. Schlessman, and W. Wolf. Remote wireless face recognition employing ZigBee. In Proceedings of the Distributed Smart Cameras Workshop, 2006.
[15] A. Klausner, A. Tengg, and B. Rinner. Vehicle classification on multi-sensor smart cameras using feature- and decision-fusion. In Proceedings of the ACM/IEEE International Conference on Distributed Smart Cameras, 2007.
[16] P. Lukowicz, J. Ward, H. Junker, M. Stager, G. Troster, A. Atrash, and T. Starner. Recognizing workshop activity using body worn microphones and accelerometers. In Proceedings of the International Conference on Pervasive Computing, 2004.
[17] J. Mantyjarvi, J. Himberg, and T. Seppanen. Recognizing human motion with multiple acceleration sensors. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, 2001.
[18] J. Morrill. Distributed recognition of patterns in time series data. Communications of the ACM, 41(5):45–51, 1998.
[19] B. Najafi, K. Aminian, A. Parschiv-Ionescu, F. Loew, C. Bula, and P. Robert. Ambulatory system for human motion analysis using a kinematic sensor: Monitoring of daily physical activity in the elderly. IEEE Transactions on Biomedical Engineering, 50(6):711–723, 2003.
[20] S. Pirttikangas, K. Fujinami, and T. Nakajima. Feature selection and activity recognition from wearable sensors. In Proceedings of the International Symposium on Ubiquitous Computing Systems, 2006.
[21] C. Sadler and M. Martonosi. Data compression algorithms for energy-constrained devices in delay tolerant networks. In Proceedings of the ACM Conference on Embedded Networked Sensor Systems, pages 265–278, 2006.
[22] A. Yang, J. Wright, Y. Ma, and S. Sastry. Feature selection in face recognition: A sparse representation perspective. Technical Report UCB/EECS-2007-99, University of California, Berkeley, 2007.
[23] W. Zhang, L. He, Y. Chow, R. Yang, and Y. Su. The study on distributed speech recognition system. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 1431–1434, 2000.


APPENDIX

In this appendix, we provide detailed classification results to demonstrate the accuracy of the proposed algorithm using all 8 sensor nodes (1-8). For clarity, each of Figs. 9-21 plots only the readings from the x-axis accelerometers on the 8 nodes for three motion sequences performed by the three subjects, respectively. The segmentation results are then superimposed. The black solid boxes indicate the locations of the correctly classified action segments. The red boxes (e.g., in Figs. 12 and 13) indicate the locations of false classifications. One can also observe from the figures that some valid actions are not detected by the algorithm, e.g., in Fig. 20.

The results clearly demonstrate that the proposed algorithm can accurately segment and classify the 12 action classes with widely different durations. The overall Precision versus Recall statistics are summarized in Table III.

Fig. 9. Segmentation of the slow Stand-Sit-Stand sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 10. Segmentation of the fast Stand-Sit-Stand sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 11. Segmentation of the slow Sit-Lie-Sit sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 12. Segmentation of the fast Sit-Lie-Sit sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 13. Segmentation of the Bend sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 14. Segmentation of the Stand-Kneel-Stand sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 15. Segmentation of the 90° Rotate-Right-Left sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 16. Segmentation of the 90° Rotate-Left-Right sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 17. Segmentation of the 180° Rotate-Right sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 18. Segmentation of the 180° Rotate-Left sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 19. Segmentation of the Jump sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 20. Segmentation of the Go-Upstairs sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Fig. 21. Segmentation of the Go-Downstairs sequences from the three subjects. Panels: (a) Subject 1, (b) Subject 2, (c) Subject 3.

Journal of Ambient Intelligence and Smart Environments 1 (2009) 1–5

IOS Press

Distributed Recognition of Human Actions

Using Wearable Motion Sensor Networks 1

Allen Y. Yang a,∗, Roozbeh Jafari b, S. Shankar Sastry a, and Ruzena Bajcsy a

a Department of EECS, University of California, Berkeley

Berkeley, CA 94705, USA

E-mail: {yang,sastry,bajcsy}@eecs.berkeley.edu

b Department of EE, University of Texas at Dallas

Richardson, TX 75083, USA

E-mail: [email protected]

Abstract. We propose a distributed recognition framework to classify continuous human actions using a low-bandwidth wearable

motion sensor network, called distributed sparsity classifier (DSC). The algorithm classifies human actions using a set of training

motion sequences as prior examples. It is also capable of rejecting outlying actions that are not in the training categories.

The classification is operated in a distributed fashion on individual sensor nodes and a base station computer. We model the

distribution of multiple action classes as a mixture subspace model, one subspace for each action class. Given a new test sample,

we seek the sparsest linear representation of the sample w.r.t. all training examples. We show that the dominant coefficients in

the representation only correspond to the action class of the test sample, and hence its membership is encoded in the sparse

representation. Fast linear solvers are provided to compute such representation via ℓ1-minimization. To validate the accuracy

of the framework, a public wearable action recognition database is constructed, called wearable action recognition database

(WARD). The database is comprised of 20 human subjects in 13 action categories. Using up to five motion sensors in the WARD

database, DSC achieves state-of-the-art performance. We further show that the recognition precision only decreases gracefully

using smaller subsets of active sensors. It validates the robustness of the distributed recognition framework on an unreliable

wireless network. It also demonstrates the ability of DSC to conserve sensor energy for communication while preserving accurate

global classification.

Keywords: action recognition, wearable sensor network, distributed perception, sparse representation, compressive sensing

1. Introduction

Action/activities recognition has been extensively

studied in the past in the literature of computer vision.

Compared with either model-based or appearance-

based vision systems, body sensor networks that we

study in this paper have several distinct advantages: 1.

Body sensor systems do not require instrumenting the

environment with cameras or other sensors. 2. Such

1 This work was partially supported by ARO MURI W911NF-06-1-0076, NSF TRUST Center, and the startup funding from the University of Texas and Texas Instruments.

* Corresponding author. E-mail: [email protected].

systems also have the necessary mobility to support

persistent monitoring of a subject during her daily ac-

tivities in both indoor and outdoor environments. 3.

With the continuing miniaturization and integration of

mobile processors and wireless sensors, it has become

possible to manufacture wearable sensor networks that

densely cover the human body to record and analyze

very small movements of the human body (e.g., breath-

ing and spine movements) with higher accuracy than

most extant vision systems. Such sensor networks can

be used in applications such as medical-care moni-

toring, athlete training, tele-immersion, and human-

computer interaction (e.g., integration of accelerome-

ters in Wii game controllers and smart phones).

1876-1364/09/$17.00 © 2009 – IOS Press and the authors. All rights reserved

2 A. Yang et al. / Distributed recognition of human activities using wearable motion sensor networks

Fig. 1. A subject wearing a body sensor network with the numbering

of the sensors superimposed in the image. The sensor system con-

sists of five wireless motion sensors, two on the wrists, one on the

waist, and two on the ankles, respectively.

In traditional sensor networks, the computation carried out

by the sensor board is fairly simple: Extract certain

local information and transmit the data to a computer

server over the network for processing. In this

paper, we propose a new method for distributed pat-

tern recognition. In this system, each sensor node will

be able to classify local, albeit biased, information.

Only when the local classification detects a possible

object/event does the sensor node become active and

transmit the measurement to a network server. 1 On the

server side, a global classifier receives data from the

sensor nodes and further optimizes the classification

upon local sensor decisions. The global classifier can

be more computationally involved than the distributed

classifiers, but it has to adapt to the change of avail-

able network sensors due to local measurement error,

sensor failure, and communication congestion.

1.1. Literature Overview

Past studies on sensor-based action recognition were

primarily focused on single accelerometers [12,15] or

other motion sensors [16,23]. More recent systems

prefer using multiple motion sensors [19,17,14,2,18,25,1]. Depending on the type of sensor used, an action recognition system is typically comprised of two parts: a feature extraction module at the sensor level and a classification module at the server level.

1 Studies have shown that the power consumption required to successfully send one byte over a wireless channel is equivalent to executing between 1e3 and 1e6 instructions on an onboard processor [26]. Hence it is paramount in sensor networks to reduce the communication cost while preserving the recognition performance.

There are three major directions for feature extrac-

tion in wearable sensor networks. The first direction

uses simple statistics in a motion sequence such as the

max, mean, variance, and energy. The second type of

feature is computed using fixed filter banks such as

FFT and wavelets [23,15]. The third type is based on

classical dimensionality reduction techniques such as

principal component analysis (PCA) and linear dis-

criminant analysis (LDA) [19,18].

In terms of classification on the action features, a

large body of previous work favored thresholding or

k-nearest-neighbor (kNN) due to the simplicity of the

algorithms for mobile devices [23,15,25]. Other more

sophisticated techniques have also been used, such as

decision trees [2,4] and hidden Markov models [18].

For distributed pattern recognition, there exist stud-

ies on distributed speech recognition [31] and dis-

tributed expert systems [22]. One particular problem

associated with most distributed sensor systems is that

each local observation from the distributed sensors is

biased and insufficient to classify all classes. For ex-

ample in our system, the sensors placed on the lower-

body would not perform well to classify those ac-

tions that mainly involve upper body motions, and vice

versa. Consequently, traditional majority-voting type

classifiers may not achieve the best performance glob-

ally.

Due to the unique mobility of wearable sensor net-

works, such systems have been applied to a vari-

ety of applications, especially in the area of human-

computer interaction. One dominant application in the

past has been single action detection for elderly peo-

ple, such as falling [29,9,27,7] and walking [24,3].

There have been other systems that tackle more gen-

eral problems of recognizing multiple different human

actions/activities that would be commonplace in peo-

ple’s daily lives [21,20,18,8]. The algorithm proposed

in this paper falls in the latter category.

1.2. Design of the Wearable Sensor Network

Our wearable sensor network consists of five sensor

nodes placed at different body locations (see Figure 1),

which communicate with a base station attached to a

computer server through a USB port. The sensor nodes

and base station are built using the commercially avail-

able Tmote Sky boards. Tmote Sky runs TinyOS on


an 8MHz microcontroller with 10K RAM and com-

municates using the 802.15.4 wireless protocol. Each

custom-built sensor board has a triaxial accelerometer

and a biaxial gyroscope, which is attached to Tmote

Sky (shown in Figure 2). Each axis is reported as a

12-bit value to the node, indicating values in the range

of ±2g and ±500°/s for the accelerometer and gyroscope,

respectively.
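As an illustration of what a 12-bit reading over these ranges implies, the sketch below converts a raw code to physical units. The encoding is an assumption made only for this example (an unsigned code in 0..4095 mapped linearly onto the full range); the text does not state how the sensor board encodes its samples, and the function and constant names are hypothetical.

```python
ADC_BITS = 12
ACCEL_RANGE_G = 2.0       # accelerometer full scale: +/-2 g (from the text)
GYRO_RANGE_DPS = 500.0    # gyroscope full scale: +/-500 deg/s (from the text)


def raw_to_physical(raw, full_scale, bits=ADC_BITS):
    """Map an unsigned code in [0, 2**bits - 1] linearly onto [-full_scale, +full_scale].

    Assumption: offset-binary style encoding; the real board may differ.
    """
    max_code = (1 << bits) - 1
    if not 0 <= raw <= max_code:
        raise ValueError(f"raw value {raw} outside 0..{max_code}")
    return (2.0 * raw / max_code - 1.0) * full_scale


# Size of one code step under the assumed mapping.
accel_step_g = 2.0 * ACCEL_RANGE_G / ((1 << ADC_BITS) - 1)
gyro_step_dps = 2.0 * GYRO_RANGE_DPS / ((1 << ADC_BITS) - 1)

print(raw_to_physical(0, ACCEL_RANGE_G), raw_to_physical(4095, ACCEL_RANGE_G))
print(f"{accel_step_g:.6f} g per step, {gyro_step_dps:.4f} deg/s per step")
```

Under this assumed mapping, one step is roughly a thousandth of a g, which is small compared with the systematic offsets discussed next.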

Fig. 2. Illustration of a motion sensor node. The sensor board on the

top is a custom-built motion sensor with a triaxial accelerometer and

a biaxial gyroscope. The middle layer contains a Li-ion battery. The

sensor board on the bottom is a standard Tmote Sky network node.

The current hardware design of the sensor con-

tributes certain amounts of measurement error. The ac-

celerometers typically require some calibration in the

form of a linear correction, as sensor output under 1g may be shifted up to 15% in some sensors. It is also

worth noting that the gyroscopes produce an indica-

tion of rotation under straight line motions. Fortunately

these systematic errors appear to be consistent across

experiments for a given sensor board. However, with-

out calibration to correct them, the errors may affect

the action recognition if different sets of sensors are

used interchangeably in the experiment. 2

To avoid packet collision in the wireless channel,

we use a time division multiple access (TDMA) proto-

col that allocates each node a specific time slot during

which to transmit data. This allows us to receive sensor

data at 30Hz with minimal packet loss. To avoid drift

in the network, the base station periodically broadcasts

a packet to resynchronize the nodes’ individual timers.

The code to interface with the sensors and transmit

data is implemented directly on the motes using nesC,

a variant of C.
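The slot allocation can be pictured with a small scheduling sketch. It is written in Python for illustration rather than in nesC, and it assumes one TDMA frame per 30 Hz sampling period, equal-length slots, and no guard intervals; none of these details are specified in the text, and `tdma_schedule` is a hypothetical helper.

```python
def tdma_schedule(num_nodes, frame_rate_hz):
    """Return (slot_s, schedule), where schedule maps node id -> slot start in seconds.

    Assumptions (not from the paper): one frame per sampling period,
    equal-length slots, no guard intervals, node ids 1..num_nodes.
    """
    if num_nodes <= 0 or frame_rate_hz <= 0:
        raise ValueError("num_nodes and frame_rate_hz must be positive")
    frame_s = 1.0 / frame_rate_hz
    slot_s = frame_s / num_nodes
    schedule = {node: (node - 1) * slot_s for node in range(1, num_nodes + 1)}
    return slot_s, schedule


slot_s, schedule = tdma_schedule(num_nodes=5, frame_rate_hz=30)
for node, start in schedule.items():
    print(f"node {node}: transmits in [{start * 1e3:.2f}, {(start + slot_s) * 1e3:.2f}) ms")
```

With five nodes, each slot is under 7 ms long in this sketch, which suggests why the periodic resynchronization of the node timers matters: a small clock drift is enough to push a transmission into a neighboring slot.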

2 More sophisticated motion sensors do exist in the industry, which can utilize heterogeneous sensor fusion techniques to self-calibrate the accelerometer and gyroscope. One example is the Microstrain Gyro Enhanced Orientation Sensor at: http://www.microstrain.com/.

1.3. Wearable Action Recognition Database

We have constructed a benchmark database for hu-

man action recognition using the above wearable mo-

tion sensor network, called Wearable Action Recogni-

tion Database (WARD). The purpose of WARD is to

offer a public and relatively stable data set as a plat-

form for quantitative comparison of existing and future

algorithms for human action recognition using wear-

able motion sensors. The database has been carefully

constructed under the following conditions:

1. The database contains sufficient numbers of hu-

man subjects with a large range of age differ-

ences.

2. The designed action classes are general enough

to cover most typical activities that a human sub-

ject is expected to perform in her daily life.

3. The locations of the wearable sensors are se-

lected to be practical for full-fledged commercial

systems.

4. The sampled action data contain sufficient varia-

tion, measurement noise, and outliers in order for

existing and future algorithms to meaningfully

examine and compare their performance.

The WARD database is available for download at:

http://www.eecs.berkeley.edu/~yang/software/WAR/.

The data are sampled from 7 female and 13

male human subjects (in total 20 subjects) with age

ranging from 19 to 75. The current version, version

1.0, includes the following 13 action categories: 1.

Stand (ST). 2. Sit (SI). 3. Lie down (LI). 4. Walk for-

ward (WF). 5. Walk left-circle (WL). 6. Walk right-

circle (WR). 7. Turn left (TL). 8. Turn right (TR). 9. Go

upstairs (UP). 10. Go downstairs (DO). 11. Jog (JO).

12. Jump (JU). 13. Push wheelchair (PU). For more

details about the data collection, please refer to the hu-

man subject protocol included in the WARD database.

The sensor data have been converted and saved in the

MATLAB environment. The database also includes a

MATLAB program to visualize the action data from

the five motion sensors.

1.4. Contribution

We propose a distributed action recognition algo-

rithm using up to five wearable motion sensors. The

work is inspired by an emerging theory of compressive

sensing [5,6]. We assume each action class satisfies a

low-dimensional subspace model. If a linear represen-

tation is sought to represent a valid test sample w.r.t.


all training samples, the dominant coefficients in the

sparsest representation correspond to the training sam-

ples from the same action class, and hence they encode

the membership of the test sample.

A distributed recognition system on wireless sensor

networks needs to further consider the following is-

sues:

1. How to extract compact and accurate low-dimensional

action features for local classification and trans-

mission over a band-limited network?

2. How to classify the local measurement efficiently

using low-power processors?

3. How to design a classifier to globally optimize

the recognition and adapt to the change of the

network?

4. Is the accuracy of an action recognition

system identity independent? That is, a good

classifier should only be sensitive to different ac-

tion classes, but neutral to the subject who per-

forms the actions.

We tackle these problems by proposing a novel

recognition framework consisting of the following

three integrated components: 1. Low-dimensional ac-

tion feature extraction. 2. Fast distributed classifiers

via ℓ1-minimization. 3. An adaptive global classifier

on the base computer. The method can accurately

classify human actions from a continuous motion se-

quence. The local classifiers that reject potential out-

liers can reduce the sensor-to-server communication

to about 50%. One can also choose to activate only

a subset of the sensors on the fly due to sensor fail-

ure or network congestion. The global classifier is able

to adaptively update the optimization process and im-

prove the global classification upon available local de-

cisions. Finally, in the experiment, we examine the

identity-independence performance on a test sequence

by excluding the training samples of the same subject.

Note that a similar algorithm was previously pub-

lished in a manuscript [28]. In comparison, [28] mainly

discusses simultaneous segmentation and classifica-

tion of transient actions, such as from standing to sit-

ting, from sitting to lying down, and bending. In this

paper, we discuss classification of continuous actions.

The preliminary results shown in [28] only contain

recognition results from three human subjects with age

ranging from 28 to 32. In this paper, the system uti-

lizes the much larger WARD benchmark to validate its

performance.

The rest of the paper is organized as follows. Sec-

tion 2 proposes a unified classification algorithm via

a novel sparse representation framework on individual

motion sensors to classify human actions with local

bias. Section 3 further proposes a global classification

algorithm on a base computer that receives action fea-

tures from active sensors in the network and adaptively

boosts the recognition upon individual sensor decisions.

Finally, we demonstrate the performance of the overall

algorithm based on the WARD benchmark in Section

4.

2. Classification via Sparse Representation

We first define the problem of distributed action

recognition.

Problem 1 (Distributed Action Recognition) Assume

a set of $L$ wearable sensor nodes with triaxial accelerometers $(x, y, z)$ and biaxial gyroscopes $(\theta, \rho)$ are attached to the human body. Denote

$$ a_j(t) \doteq (x_j(t), y_j(t), z_j(t), \theta_j(t), \rho_j(t))^T \in \mathbb{R}^5 \tag{1} $$

as the measurement of the five readings on node $j$ at time $t$, and

$$ a(t) \doteq (a_1^T(t), a_2^T(t), \cdots, a_L^T(t))^T \in \mathbb{R}^{5L} \tag{2} $$

collects all $L$ sensors at time $t$. Further denote

$$ s = (a(1), a(2), \cdots, a(l)) \in \mathbb{R}^{5L \times l} \tag{3} $$

as an action segment of length $l$ in time.

Given $K$ different classes of human actions, a set of $n_i$ training examples $\{s_{i,1}, \cdots, s_{i,n_i}\}$ are collected for each $i$th class, all of which have the same duration $l$. Given a new test sequence $s$, we seek a distributed algorithm to classify the action into one of the $K$ categories, or reject the action as an invalid measurement. Finally, given continuous measurements of different human activities, determine an optimal duration parameter $l$ to extract training samples and test samples $s$.

In this section, we focus on an action classification method on each sensor node, assuming an action segment of a fixed duration $l$. Given $s_j = (a_j(1), a_j(2), \cdots, a_j(l)) \in \mathbb{R}^{5 \times l}$ on node $j$, define a new vector $s_j^S$ as the stacking of the $l$ columns of $s_j$:

$$ s_j^S \doteq (a_j(1)^T, a_j(2)^T, \cdots, a_j(l)^T)^T \in \mathbb{R}^{5l}. \tag{4} $$

We will interchangeably use $s_j$ to denote the stacked vector $s_j^S$ without causing ambiguity.

Subsequently, we define a full-body action vector $v$ that stacks the measurement from all $L$ nodes:

$$ v \doteq (s_1^T, s_2^T, \cdots, s_L^T)^T \in \mathbb{R}^D, \tag{5} $$

where $D = D_1 + \cdots + D_L = 5lL$.
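The stacking in (4) and (5) can be sketched with NumPy as follows. This is an illustrative example only: the readings are random placeholders, the helper name `stack_node` is hypothetical, and the values of L and l are the ones quoted elsewhere in the paper (five nodes, 45 samples).

```python
import numpy as np

L, l = 5, 45  # number of nodes and segment length in samples
rng = np.random.default_rng(0)

# Placeholder data: segments[j] plays the role of the 5 x l matrix s_j of node j.
segments = [rng.standard_normal((5, l)) for _ in range(L)]


def stack_node(s_j):
    """Stack the l columns of a 5 x l matrix into one vector of length 5l, as in (4)."""
    # Column-major ("F") order keeps the five readings of each time step together.
    return s_j.reshape(-1, order="F")


# Full-body action vector v of dimension D = 5 * l * L, as in (5).
v = np.concatenate([stack_node(s_j) for s_j in segments])
print(v.shape)
```

Column-major order is the choice that matches (4): the first five entries of each stacked vector are the readings at time 1, the next five the readings at time 2, and so on.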

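In this paper, we assume the samples $v$ in an action class satisfy a subspace model, called an action subspace. If the training samples $\{v_1, \cdots, v_{n_i}\}$ of the $i$th class sufficiently span the $i$th action subspace, given a test sample $y = (y_1^T, \cdots, y_L^T)^T \in \mathbb{R}^D$ in the same class $i$, $y$ can be linearly represented using the training examples of the same class:

$$ y = \alpha_1 v_1 + \cdots + \alpha_{n_i} v_{n_i} \;\Leftrightarrow\; \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_L \end{pmatrix} = \left( \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_{1} \cdots \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_{n_i} \right) \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n_i} \end{pmatrix}. \tag{6} $$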

It is important to note that such a linear constraint also holds for each node $j$ in (6):

$$ y_j = \alpha_1 s_{j,1} + \cdots + \alpha_{n_i} s_{j,n_i} \in \mathbb{R}^{D_j}. \tag{7} $$

In theory, complex data such as human actions typ-

ically constitute more complex nonlinear models. The

linear models are used to approximate such nonlin-

ear structures in a higher-dimensional subspace (see

Figure 3). Notice that such linear approximation may

not produce good estimation of the distance/similarity

metric for the samples on the manifold. However, as

we will show in Example 1, given sufficient samples

on the manifold as training examples, a new test sam-

ple can be accurately represented on the subspace, pro-

vided that any two classes do not have similar subspace

models.

Fig. 3. Modeling a 1-D manifold M using a 2-D subspace V .

To recover $\mathrm{label}(y)$, a previous study [30] proposes to reformulate the recognition using a sparse representation: Since $\mathrm{label}(y) = i$ is unknown, we can represent $y$ using all the training samples from all $K$ classes:

$$ y = (A_1, A_2, \cdots, A_K) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \end{pmatrix} = Ax, \tag{8} $$

where

$$ A_i = (v_{i,1}, v_{i,2}, \cdots, v_{i,n_i}) \in \mathbb{R}^{D \times n_i} \tag{9} $$

collects all the training samples of class $i$,

$$ x_i = (\alpha_{i,1}, \alpha_{i,2}, \cdots, \alpha_{i,n_i})^T \in \mathbb{R}^{n_i} \tag{10} $$

collects the corresponding coefficients in (6), and $A \in \mathbb{R}^{D \times n}$ where $n = n_1 + n_2 + \cdots + n_K$. Since $y$ satisfies both (6) and (8), one solution of $x$ in (8) should be

$$ x^* = (0, \cdots, 0, x_i^T, 0, \cdots, 0)^T. \tag{11} $$

The solution is naturally sparse: on average only $\frac{1}{K}$ of the terms in $x^*$ are nonzero.

It is important to note that, on each sensor $j$ in this section, solution $x^*$ of (8) is also a solution for the representation:

$$ y_j = (A_1^{(j)}, A_2^{(j)}, \cdots, A_K^{(j)}) x = A^{(j)} x, \tag{12} $$

where $A_i^{(j)} \in \mathbb{R}^{D_j \times n_i}$ consists of row vectors in $A_i$ that correspond to the $j$th node. Hence, $x^*$ can be

solved either globally using (8) or locally using (12),

provided that the action data measured on each node

are sufficiently discriminant. We will come back to the

discussion about local classification versus global clas-

sification in Section 3. In the rest of this section how-

ever, our focus will be on each node.
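The relation between the global system (8), the block-sparse solution (11), and the per-node system (12) can be checked numerically. The sketch below uses random placeholder data; the number of classes, the samples per class, and the per-node dimensions are hypothetical values chosen only to keep the example small, and class and node indices are zero-based.

```python
import numpy as np

rng = np.random.default_rng(1)
K, n_per_class = 4, 6        # hypothetical: 4 classes, 6 training samples each
D_nodes = [10, 10, 10]       # hypothetical per-node dimensions D_j
D, n = sum(D_nodes), K * n_per_class

# A = (A_1, ..., A_K): training samples of all classes as columns, as in (8).
A = rng.standard_normal((D, n))

# x* as in (11): only the block of the true class i carries nonzero coefficients.
i = 2
x_star = np.zeros(n)
x_star[i * n_per_class:(i + 1) * n_per_class] = rng.standard_normal(n_per_class)
y = A @ x_star

# The rows of A and y that belong to node j give A^(j) and y_j, as in (12).
j = 1
rows = slice(sum(D_nodes[:j]), sum(D_nodes[:j + 1]))
A_j, y_j = A[rows], y[rows]

print(np.allclose(A_j @ x_star, y_j))   # the same x* also represents y_j
print(np.count_nonzero(x_star) / n)     # fraction of nonzero terms in x*
```

Restricting the system to the rows of one node keeps x* feasible, but it also removes equations, so other solutions may appear. This is why the text adds the condition that the data on each node be sufficiently discriminant.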

One major difficulty in solving (12) is the high dimensionality of the action data. In compressive sensing [5,6], one reduces the dimension of a linear system by choosing a linear projection $R_j \in \mathbb{R}^{d \times D_j}$:³

$$ \tilde{y}_j \doteq R_j y_j = R_j A^{(j)} x \doteq \tilde{A}^{(j)} x \in \mathbb{R}^d. \tag{13} $$

After projection $R_j$, typically the feature dimension $d$ is much smaller than the number $n$ of all training samples. Therefore, the new linear system (13) is underdetermined. Numerically stable solutions exist to uniquely recover sparse solutions $x^*$ via $\ell_1$-minimization [10]:

$$ x^* = \arg\min \|x\|_1 \quad \text{subject to} \quad \tilde{y}_j = \tilde{A}^{(j)} x, \tag{14} $$

with $\tilde{y}_j$ and $\tilde{A}^{(j)}$ the projected quantities in (13). These routines include (orthogonal) matching pursuit (MP), basis pursuit (BP), and the LASSO.⁴

3 Notice that $R_j$ is not computed on the sensor node. These matrices are computed offline and simply stored on each sensor node.
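To make the idea of recovering a sparse solution from an underdetermined system concrete, the sketch below implements orthogonal matching pursuit, one of the routines named above, in plain NumPy. It is a greedy stand-in chosen for brevity and is not the basis pursuit solver the experiments rely on; the function `omp` and the toy matrix (identity columns next to normalized Hadamard columns) are assumptions made for this example.

```python
import numpy as np


def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily build a sparse x with y close to A x."""
    residual = y.astype(float).copy()
    support = []
    x = np.zeros(A.shape[1])
    for _ in range(sparsity):
        # Select the column most correlated with the current residual.
        k = int(np.argmax(np.abs(A.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    if support:
        x[support] = coef
    return x


# Toy underdetermined system: 16 equations, 32 unknowns, unit-norm columns.
H = np.array([[1.0]])
for _ in range(4):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))  # 16 x 16 Hadamard matrix
A = np.hstack([np.eye(16), H / 4.0])

x_true = np.zeros(32)
x_true[3], x_true[20] = 2.0, -1.5
x_hat = omp(A, A @ x_true, sparsity=2)
print(np.allclose(x_hat, x_true))
```

The toy matrix has low mutual coherence (0.25), which is what allows a 2-sparse vector to be recovered exactly by the greedy procedure. Real training matrices offer no such guarantee.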

In our experiment, we have tested multiple projection operators including PCA, LDA, locality preserving projection (LPP) [13], and the random projection studied in [30]. We found that a 40-D feature space using LPP produces the best recognition in a very low-dimensional space. Throughout this paper, we will use 40-D LPP features to represent local motions measured on sensor nodes.⁵

After the (sparsest) representation $x$ is recovered, we project the coefficients onto each action subspace:

$$ \delta_i(x) = (0, \cdots, 0, x_i^T, 0, \cdots, 0)^T \in \mathbb{R}^n, \quad i = 1, \cdots, K. \tag{15} $$

Subsequently, the membership of the test sample $y_j$ is assigned to the class with the smallest residual:

$$ \mathrm{label}(y_j) = \arg\min_{i=1,\cdots,K} \|\tilde{y}_j - \tilde{A}^{(j)} \delta_i(x)\|_2, \tag{16} $$

with $\tilde{y}_j$ and $\tilde{A}^{(j)}$ the projected quantities in (13). The overall algorithm deployed on each sensor node is summarized in Algorithm 1, which is called local sparsity classifier (LSC).
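The projection (15) and the residual rule (16) can be sketched as follows. The function name, the toy matrix, and the coefficient vector are hypothetical and serve only to show the arithmetic; class labels are zero-based in the code.

```python
import numpy as np


def classify_by_residual(A_tilde, y_tilde, x, class_sizes):
    """Return (label, residuals); label is the zero-based class with the smallest residual.

    class_sizes lists n_1, ..., n_K, the number of columns of A_tilde per class.
    """
    bounds = np.concatenate([[0], np.cumsum(class_sizes)])
    residuals = []
    for i in range(len(class_sizes)):
        # delta_i(x): keep the coefficients of class i and zero out the rest, as in (15).
        delta_i = np.zeros_like(x)
        delta_i[bounds[i]:bounds[i + 1]] = x[bounds[i]:bounds[i + 1]]
        # Residual of the reconstruction that uses class i only, as in (16).
        residuals.append(float(np.linalg.norm(y_tilde - A_tilde @ delta_i)))
    return int(np.argmin(residuals)), residuals


# Toy data: 3 classes with 2 training columns each in a 4-D feature space.
A_tilde = np.array([[1, 0, 0, 0, 1, 1],
                    [0, 1, 0, 0, 1, -1],
                    [0, 0, 1, 0, 0, 0],
                    [0, 0, 0, 1, 0, 0]], dtype=float)
x = np.array([0.0, 0.1, 0.9, 1.1, 0.0, 0.0])  # dominant coefficients lie in class 1
y_tilde = A_tilde @ x

label, residuals = classify_by_residual(A_tilde, y_tilde, x, [2, 2, 2])
print(label, np.round(residuals, 3))
```

The small coefficient that leaks into class 0 barely reduces that class's residual, so the decision is driven by where the dominant coefficients sit.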

Example 1 (Classification on Nodes) We demonstrate

the recognition accuracy of LSC on individual nodes

based on the WARD database. First, we look for fast

sparse solvers in the literature. We found that BP [11]

gives the best trade-off between speed, noise tolerance,

and recognition accuracy.

We design the training set and the test set as follows.

For each motion sequence in the WARD database, we

4The implementation of these routines is available in a MATLAB

toolbox called SparseLab: http://sparselab.stanford.

edu.5The choice of an “optimal” low-dimensional feature space is not

the emphasis of this paper. On one hand, a practitioner may easily

replace LPP with other feature spaces without modification of the

algorithm. On the other hand, a previous result in [30] has shown that

the accuracy of sparse representation via ℓ1-minimization converges

among different linear projections, as long as the dimension of the

feature space is sufficiently high. The result renders the choice of a

particular feature space not very significant in solving for a sparse

representation.

Algorithm 1: Local Sparsity Classifier (LSC).

Input: A set of training samples A(j) = (sj,1, · · · , sj,n), a test sample yj on a sensor node j, and a linear projection matrix Rj.

1: Projection: ỹj = Rj yj.

2: x∗ = arg min ‖x‖1 subject to ỹj = Rj A(j) x.

3: label(yj) = arg min_{i=1,··· ,K} ‖ỹj − Rj A(j) δi(x∗)‖2.

Output: label(yj), action feature ỹj, and x∗.

randomly sample 10 segments of length l in the training set. In total, there are 20 × 13 × 5 × 10 = 13000 training examples. During the testing, LSC attempts

to classify all continuous segments of length l in the

WARD database. With respect to each subject, the cor-

responding training examples will be excluded from

the training set before classification. Therefore, any

test subject is not present in the training set, and the

recognition is subject independent. In the experiment,

we found that l = 45 is a short action duration that

yields satisfactory performance, which corresponds to

1.5 seconds given the 30 Hz sampling rate.

Figure 4 illustrates an example of sparse represen-

tation x and its corresponding residuals estimated on

the first node (left wrist) of a jumping sequence (Action

12).

Fig. 4. Top: Sparse ℓ1 solution by BP of a jumping motion on the

left wrist node. Bottom: Reconstruction residuals with respect to the

13 action categories. The test sample is correctly classified as Class

12. SCI(x) = 0.335 (see (17))

Table 1 shows the recognition accuracy of LSC.

There should be no surprise that LSC alone based on

single node measurement of human activities does not

produce good classification, as many human activities

engage movements at multiple body parts. For exam-


ple, nodes at the two ankle positions cannot differenti-

ate walking forward and pushing wheelchair because

the feet engage similar movements for both categories.

In Table 1, we also show the performance of a simple

global classifier: majority voting. If all the local deci-

sions are collected and a majority vote is chosen as the

overall classification of the test action, LSC achieves

90.2% accuracy. This will become a baseline bench-

mark to compare with an adaptive classifier we will

introduce in the next section.
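The majority-voting baseline can be sketched in a few lines. The tie-breaking rule below (the label seen first wins) is an assumption made for the example; the text does not specify how ties among local decisions are resolved.

```python
from collections import Counter


def majority_vote(local_labels):
    """Return the most frequent local label; ties go to the label seen first."""
    return Counter(local_labels).most_common(1)[0][0]


# Five hypothetical local decisions, one per sensor node.
vote = majority_vote([4, 4, 9, 8, 4])
```

When every node reports a different label, the vote carries no real information and the outcome depends entirely on the tie-breaking rule.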

Table 1

Recognition accuracy via LSC on each node. The last column (1–5)

shows the recognition accuracy using majority voting.

Sen # 1 2 3 4 5 1–5

Acc [%] 65.08 61.26 63.9 78.56 77 90.2

Nearest neighbor (NN) is one of the popular meth-

ods used in sensor networks for classification. Table 2

shows the recognition accuracy of NN on the WARD

database. We compare Table 1 and Table 2. Because

the inherent correlation between the distributed motion sensors is not considered beyond the majority-voting process, the two algorithms generate very similar global recognition accuracy. Using majority voting, nearest neighbor achieves 90.5%.

Table 2

Recognition accuracy via nearest neighbor on each node. The last

column (1–5) shows the recognition accuracy using majority voting.

Sen # 1 2 3 4 5 1–5

Acc [%] 64.9 59.3 67.4 80.3 76.0 90.5

3. Adaptive Global Recognition

In this section, we introduce an adaptive frame-

work to optimize a global classification based on all

the available distributed sensor data. First, we discuss

an outlier rejection criterion to identify invalid mo-

tion samples measured on the individual sensor nodes.

The invalid samples would not be sent to the global

classifier that we will introduce later. The ability to

locally reject invalid measurement reduces the power

consumption on the sensor nodes to communicate with

the network station, as we will show in Section 4.

Based on the previous sparsity assumption, if yj is

not a valid segment on node j w.r.t. the training ex-

amples A(j), the dominant coefficients of its sparsest

representation x should not correspond to any single

class. We utilize a sparsity concentration index (SCI)

[30]:

\mathrm{SCI}(x) \doteq \frac{K \cdot \max_{j=1,\cdots,K} \|\delta_j(x)\|_1 / \|x\|_1 - 1}{K - 1} \in [0, 1]. \qquad (17)

If the nonzero coefficients of x are evenly distributed

among K classes, then SCI(x) = 0; if all the nonzero

coefficients are associated with a single class, then

SCI(x) = 1. Therefore, we introduce a sparsity

threshold τ1 applied on individual sensor nodes: If

SCI(x) > τ1, the motion sample is a valid local mea-

surement, and its 40-D LPP features y will be sent to

the base station; otherwise, the sample will be ignored.
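A minimal sketch of the SCI computation and the local validity test follows. The helper names `sci` and `is_valid`, the `class_of` index array, and the convention of returning 0 for an all-zero x are assumptions made for the example.

```python
import numpy as np


def sci(x, class_of, num_classes):
    """Sparsity concentration index (17) of a coefficient vector x."""
    x = np.abs(np.asarray(x, dtype=float))
    total = x.sum()
    if total == 0.0:
        return 0.0  # assumed convention for an all-zero x; not specified in the text
    # l1 mass of the coefficients belonging to each class.
    per_class = np.bincount(class_of, weights=x, minlength=num_classes)
    return (num_classes * per_class.max() / total - 1.0) / (num_classes - 1.0)


def is_valid(x, class_of, num_classes, tau):
    """A local sample is kept only if its SCI exceeds the threshold tau."""
    return sci(x, class_of, num_classes) > tau


class_of = np.array([0, 0, 1, 1, 2, 2])
concentrated = sci([0, 0, 0.7, 0.3, 0, 0], class_of, 3)   # all weight in one class
spread = sci([1, 0, 1, 0, 1, 0], class_of, 3)             # equal weight per class
```

The two toy vectors exercise the two extremes described above: coefficients confined to one class and coefficients spread evenly over all K classes.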

It is important to note that a local measurement that

is labeled as a valid sample w.r.t. τ1 may not truly

correspond to a valid human action when multiple sensor data are jointly considered based on the training

actions defined in the WARD database. For example,

WF, UP, and DO all involve similar upper body move-

ments; on the other hand, if a subject only tries to

mimic a WF motion by moving the upper body but not

the lower body, the movement becomes an invalid ac-

tion when both the upper body data and the lower body

data are jointly considered. Therefore, a global con-

straint is needed to reject such invalid samples, which

will be discussed next.

Suppose at time t, the base station receives L′ action

features from the active sensors (L′ ≤ L). Without loss

of generality, assume these features are from the first

L′ sensors: y1, y2, · · · , yL′ .

Denote

y' = (y_1^T, \cdots, y_{L'}^T)^T \in \mathbb{R}^{dL'}. \qquad (18)

Then the global sparse representation x of y′ satisfies

the following linear system

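y' = \begin{pmatrix} R_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & R_{L'} & \cdots & 0 \end{pmatrix} A x = R' A x = A' x, \qquad (19)

where R′ ∈ R^{dL′×D} is a new projection matrix that only extracts the action features from the first L′ nodes.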

Consequently, the effect of changing active sensor

nodes for the global classification is formulated via the

global projection matrix R′. During the transforma-

tion, the data matrix A and the sparse representation

x remain unchanged. The linear system (13) then be-

comes a special case of (19) where L′ = 1.
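To make the role of R′ concrete, the sketch below assembles the global projection matrix for an arbitrary set of active sensors and checks that the stacked local features equal R′Ax. Allowing any subset of sensors (rather than the first L′) and all names and dimensions in the toy example are assumptions made for illustration.

```python
import numpy as np


def global_projection(R_list, active):
    """Build R': one block row per active sensor, zero columns for inactive ones."""
    dims = [R.shape[1] for R in R_list]              # raw dimension of each sensor
    offsets = np.concatenate([[0], np.cumsum(dims)])
    rows = []
    for j in active:
        block = np.zeros((R_list[j].shape[0], offsets[-1]))
        block[:, offsets[j]:offsets[j + 1]] = R_list[j]
        rows.append(block)
    return np.vstack(rows)


rng = np.random.default_rng(0)
R_list = [rng.standard_normal((2, 4)) for _ in range(3)]   # L = 3, d = 2, raw dim 4
A = rng.standard_normal((12, 5))                           # stacked training data, D = 12
x = np.array([0.0, 1.5, 0.0, 0.0, -0.5])
active = [0, 2]                                            # sensor 1 dropped out
R_prime = global_projection(R_list, active)
# Features the active nodes would transmit, stacked as in (18).
y_prime = np.concatenate([R_list[j] @ A[4 * j:4 * j + 4] @ x for j in active])
```

Only R′ changes when the set of active sensors changes; the data matrix A and the representation x are reused as they are.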


Similar to the outlier rejection criterion on each

node, we introduce a global rejection threshold τ2. If

SCI(x) > τ2 in (19), the most significant coefficients

in x are concentrated in a single training class. Hence

y′ is assigned to that class. Otherwise, the sample will

be rejected as an outlier. The overall algorithm on the

network station is summarized in Algorithm 2, which

is called distributed sparsity classifier (DSC). DSC

provides a unified solution to detect and classify action

segments in a network of body sensors using only two

simple parameters τ1 and τ2.

Algorithm 2: Distributed Sparsity Classifier (DSC).

Input: A set of stacked training samples A = {v1, · · · , vn} from sensors 1, · · · , L, test sample y of action features measured from L active sensors, and sparsity parameters τ1, τ2.

1: for each sensor 1 ≤ j ≤ L do

2: Solve for sparse representation x∗ using Algorithm 1 with parameters A(j) and yj.

3: If SCI(x∗) > τ1, send feature vector yj to the network station.

4: end for

5: Collect all valid features y′, construct corresponding training matrix A′.

6: Solve x∗ = arg min ‖x‖1 subject to y′ = R′A′x.

7: if SCI(x∗) > τ2 then

8: label(y) = arg min_{i=1,··· ,K} ‖y′ − R′A′δi(x)‖2.

9: else

10: label(y) = −1 (outlier).

11: end if

Output: label(y).
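The control flow of DSC (local gating by τ1, stacking of the surviving features, global gating by τ2, minimum-residual labeling) can be sketched as below. This is a simplified sketch under several assumptions: the training matrices are passed in already projected, the sparse solver is supplied by the caller, returning −1 when no sensor passes the local test is a choice made here, and all names are hypothetical. The toy matrices are square so that plain least squares recovers the sparse coefficients exactly; a real deployment would plug in an ℓ1 solver.

```python
import numpy as np


def sci(x, class_of, K):
    """Sparsity concentration index of x over K classes."""
    per_class = np.bincount(class_of, weights=np.abs(x), minlength=K)
    total = per_class.sum()
    return 0.0 if total == 0 else (K * per_class.max() / total - 1) / (K - 1)


def dsc(train_feats, test_feats, class_of, K, tau1, tau2, solve):
    """train_feats[j]: d x n projected training matrix of sensor j;
    test_feats[j]: length-d projected test feature of sensor j;
    solve(A, y): any sparse solver returning x with A x ≈ y."""
    # Local stage: a node transmits only if its local SCI exceeds tau1.
    active = [j for j, (A_j, y_j) in enumerate(zip(train_feats, test_feats))
              if sci(solve(A_j, y_j), class_of, K) > tau1]
    if not active:
        return -1, active                      # assumed: nothing received means outlier
    # Stack the received features and the matching rows of the training data.
    y_glob = np.concatenate([test_feats[j] for j in active])
    A_glob = np.vstack([train_feats[j] for j in active])
    # Global stage: sparse representation, outlier test, minimum class residual.
    x = solve(A_glob, y_glob)
    if sci(x, class_of, K) <= tau2:
        return -1, active
    residuals = [np.linalg.norm(y_glob - A_glob @ np.where(class_of == c, x, 0.0))
                 for c in range(K)]
    return int(np.argmin(residuals)), active


rng = np.random.default_rng(0)
class_of = np.array([0, 0, 1, 1])
train = [rng.standard_normal((4, 4)) for _ in range(2)]
lstsq = lambda A, y: np.linalg.lstsq(A, y, rcond=None)[0]   # stand-in solver
tests = [train[0][:, 2],                                    # sensor 0: clean class-1 sample
         train[1] @ np.array([1.0, 0.0, 1.0, 0.0])]         # sensor 1: evenly mixed sample
label, active = dsc(train, tests, class_of, 2, 0.1, 0.08, lstsq)
```

In the toy run the evenly mixed sample on the second sensor has an SCI near zero and is dropped locally, so the global decision is made from the first sensor alone.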

Example 2 (Distributed Sparsity Classifier) Consider

Action 13 in the WARD database, i.e., PU (pushing a

wheelchair). While the upper body motion of this ac-

tion is quite distinct, the lower body motion often re-

sembles several other actions in the database, such as

WF and UP. Figure 5 illustrates the ℓ1 solutions on the

five individual sensor nodes.

First, we observe that the local sparsity classifier (LSC) returns five different labels w.r.t. the local measurements on the five sensors. This suggests that majority-voting type solutions would most likely fail to correctly classify this motion. Second, using a threshold τ1 against the SCI values of the representations, we can reject a certain number of the local motions as invalid measurements.

Assume τ1 = 0.1 is selected for all five sensors. Then measurements from Sensors 1 and 2 will be rejected, and DSC solves for a sparse representation using the three 40-D action features from Sensors 3, 4, and 5. Figure 6 shows the global ℓ1 solution of (19), and the full-body motion is correctly classified as Action 13.

Fig. 6. Top: DSC sparse representation of a sample from action 13

in Figure 5. Assume τ1 = 0.1 and τ2 = 0.08, and Sensors 1 and

2 are rejected. Bottom: Reconstruction residuals with respect to the

13 action categories. The test sample is correctly classified as Class

13.

Notice that at the node level, none of Sensors 3, 4,

and 5 correctly classifies the action based on the avail-

able local observations, because they are also simi-

lar to other actions such as UP, TR, and WR. How-

ever, when the measurements from multiple sensors are

combined in (19) to represent the full-body motion, the

incorrect local decisions are rectified. Such ability is

the main reason that the proposed DSC framework can

outperform other majority-voting type algorithms. We

will examine the performance of DSC in more detail in

Section 4.

The DSC method compares favorably to other clas-

sical methods such as NN and decision trees, because

these methods need to train multiple thresholds and

outlier rejection rules when the number L′ and the set

of available sensors vary in the full-body action vector y′ = (y_1^T, · · · , y_L′^T)^T. Particularly, a global nearest-neighbor (GNN) algorithm can be modeled as a special case of sparse representation. Suppose in (19) there are L′ active sensors and denote A′ = (v′_1, v′_2, · · · , v′_n).


(a) Sparse representation of the left wrist motion. Local classification label is 13 (PU).

(b) Sparse representation of the right wrist motion. Local classification label is 4 (WF).

(c) Sparse representation of the waist motion. Local classification label is 9 (UP).

(d) Sparse representation of the left ankle motion. Local classification label is 8 (TR).

(e) Sparse representation of the right ankle motion. Local classification label is 6 (WR).

Fig. 5. Illustration of a PU motion (action 13) classified on individual sensor nodes. Each LSC estimates a different action category that correlates to the true action. Compared to Figure 4, these solutions have much lower SCI values.

Then GNN solves for the following sparse representation of y′:

x^* = (0, \cdots, 0, 1_i, 0, \cdots, 0)^T \quad \text{subject to} \quad i = \arg\min_j \|y' - v'_j\|_2. \qquad (20)

The optimal solution x∗ for GNN is clearly sparse with only one nonzero coefficient corresponding to the closest neighbor of y′ in the training set A′. The formulation also generalizes to k-nearest-neighbors (kNN) and other similar variations.
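The one-hot representation produced by GNN can be written out directly. The sketch below assumes the columns of A′ are the stacked training features; the function name `gnn_representation` and the toy data are made up for the example.

```python
import numpy as np


def gnn_representation(A_prime, y_prime):
    """One-hot x* as in (20): a single 1 at the training column nearest to y'."""
    dists = np.linalg.norm(A_prime - y_prime[:, None], axis=0)
    i = int(np.argmin(dists))
    x = np.zeros(A_prime.shape[1])
    x[i] = 1.0
    return x, i


A_prime = np.array([[0.0, 1.0, 4.0],
                    [0.0, 1.0, 4.0]])
x_star, nearest = gnn_representation(A_prime, np.array([0.9, 1.2]))
```

Unlike the ℓ1 solution, this representation always has exactly one nonzero entry, so its SCI is 1 regardless of how far the nearest neighbor actually is.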

Finally, we consider how the change of active nodes affects ℓ1-minimization and the classification of the actions. In compressive sensing, the efficacy of ℓ1-minimization in solving for the sparsest solution x in (19) is characterized by the ℓ0/ℓ1 equivalence relation [10,11]. A necessary and sufficient condition for the equivalence to hold is the k-neighborliness of A′. As a special case, one can show that if x is the sparsest solution in (19) for L′ = L, x is also a solution for L′ < L. Hence, the decrease of L′ leads to possibly sparser solutions of x.

On the other hand, the decrease in available action

features also makes y′ less discriminant. For example, if we reduce L′ to 1 and only activate a wrist sensor, then the ℓ1-solution x may have nonzero coefficients associated with multiple actions with similar wrist motions, albeit sparser. This is an inherent problem

for any method to classify human actions using a lim-

ited number of motion sensors. In theory, if two action

subspaces in a low-dimensional feature space have a

small subspace distance after the projection, the corre-

sponding sparse representation cannot distinguish the

test samples from the two classes. We will demonstrate

in Section 4 that indeed reducing the available mo-


tion sensors will reduce the discriminant power of the

sparse representation in a lower-dimensional space.

4. Experiment

In this section, we conduct extensive experiments to

examine the performance of the DSC framework us-

ing the WARD database. Two different scenarios are

considered. First, we calculate the classification accu-

racy with different subsets of motion sensors available

in the network. This experiment is intended to verify

that DSC is adaptive to the change of network config-

uration on-the-fly due to real-world conditions such as

sensor failure, battery failure, and network congestion.

Second, we consider the effect of the local outlier rejection threshold τ1 on the accuracy of the global classification: Higher rejection thresholds save power consumption in communication at the expense of less local information available to the global classifier, and vice versa. It is important to note that to measure the

performance under the identity-independence assump-

tion, all training examples of a test subject should be

excluded from the training set during the experiment.

For each motion sequence in the WARD database, we

randomly sample 10 segments of length l = 45 as

training examples.

4.1. Classification with Different Network

Configurations

We first test the performance of DSC by manually eliminating a certain number of available sensors in the network. Based on the total number L′ of LPP feature vectors received, DSC is able to update the classification criterion (19) on-the-fly and adapt to the potentially adverse conditions. Table 3 shows the performance of the algorithm, which is quantified by false positive rate (FPR), verification rate (VR), and active sensor rate (ASR).6 For all the trials, the outlier rejection thresholds τ1 and τ2 are both set to 0.08. The duration l of the test action is set to 45 samples, which corresponds to 1.5 seconds given the 30 Hz sampling rate. When all continuous action segments of length 45 are classified in the experiment, the total number of test samples amounts to 500828.

6FPR is the percentage of samples that are either true outliers falsely classified as inliers or true inliers assigned to the wrong classes. VR is the percentage of samples that are correctly classified as inliers. Note that the WARD database does not purposely contain outlying actions, hence FPR is equal to one minus the accuracy percentage.

Table 3

Performance of DSC measured by false positive rate (FPR), verification rate (VR), and active sensor rate (ASR).

Sen #     1-5    1,3,4  1,4    1,3    3,4

FPR [%]   7.14   8      11.49  17.97  14.63

VR [%]    94.59  96.84  98.19  95.57  97.28

ASR [%]   91.85  54.82  37.66  35.58  36.76

We compare the performance of DSC to the conventional solution of GNN (20). Since the WARD database does not purposely contain outliers, we did not use any outlier rejection rule in searching for nearest neighbors, which could be difficult to tune when the available action features change on-the-fly. Table 4 shows the performance of the algorithm. Compared with Table 2, we observe that there is no improvement w.r.t. classification using all five sensors. In fact, the accuracy in Table 4 is lower than the accuracy of 90.5% in Table 2 using majority voting. This result demonstrates the dependency of NN-type algorithms on the (dense) distribution of training examples in a high-dimensional data space. Compared with Table 3, GNN also underperforms DSC. For example, DSC outperforms GNN by about 6% using Sensors (1, 3, 4), and about 9% using Sensors (1, 3).

Table 4

Performance of GNN measured by false positive rate (FPR), verification rate (VR), and active sensor rate (ASR).

Sen #     1-5    1,3,4  1,4    1,3    3,4

FPR [%]   10.64  14.54  13.93  26.88  18.27

ASR [%]   100    60     40     40     40

We further analyze the classification between different action categories. Table 5 shows a confusion table of the DSC results using all five sensors, in accuracy percentage. The confusion table clearly indicates several action categories that contribute most to the false positive rate.

1. We observe that three action categories, i.e., ST, SI, and LI, have the highest misclassification. Particularly, it is difficult to differentiate between standing and sitting in the WARD database using both DSC and NN (whose confusion table is not shown in this paper). We argue that the problem is mainly caused by the choice of the locations of the two lower-body sensors at the ankles, because human subjects do not have to move the ankles to perform both standing and sitting actions, and inherently the change of the orientation of the waist sensor is also small between standing and sitting. To improve the classification of the three action categories, one solution could be to introduce new sensor locations around the knees and the thighs.7

2. Between the actions WF, WL, and WR, the algorithm in fact performs better than one would have expected, because the difference between the three actions is small. For example, 2.5% of the WF action is misclassified as WL, 1.6% misclassified as WR, and furthermore 2.3% misclassified as PU. These are all actions that are similar in nature.

3. Despite the similarity of local motions between

PU and several other motions, the recognition

of PU is quite accurate. The last row of Table

5 shows that about 0.1% to 0.3% test samples

are misclassified as 10 of the other 12 categories.

Nevertheless, the true positive rate of PU is above

98%.

4.2. Classification with Different Rejection

Thresholds

In this experiment, we test the effect of different local rejection thresholds τ1 on the global classification. During the experiment, the global rejection threshold τ2 is fixed at 0.08. Table 6 shows the performance of the DSC algorithm. First, ASR naturally decreases as τ1 increases. Particularly, compared to ASR = 91.85% when τ1 = 0.08, the rate is reduced to 45.58% when τ1 = 0.18, which means that on average fewer than half of the sensors transmit action features during the experiment. With more than half of the sensors inactive in the network to conserve power, the experiment shows that DSC still achieves below 8% FPR globally, and VR is above 88%. The result corroborates the design principle of the DSC algorithm: the distributed classification framework via sparse representation is capable of effectively reducing the power consumption on communication while preserving high recognition accuracy.

7In a previous study [28], we also suggested that the sensors

placed at the ankle locations tend to provide less action information

than the other conventional locations such as the knees and the waist.

Table 6

Recognition accuracy of DSC with different local rejection thresholds.

τ1 0.08 0.12 0.18

ASR [%] 91.85 72.19 45.58

FPR [%] 7.14 7.58 7.96

VR [%] 94.59 91.03 88.33

5. Conclusion and Discussion

Inspired by the emerging compressive sensing the-

ory, we have proposed a distributed algorithm, i.e., dis-

tributed sparsity classifier (DSC), to classify human

actions/activities on a wearable motion sensor net-

work. The framework provides a unified solution based

on ℓ1-minimization to classify valid action segments

and reject outlying actions on the sensor nodes and the

base station. We have shown through our experiment

that a set of 13 action classes can be accurately repre-

sented and classified using a set of 40-D LPP features

measured at multiple body locations. The proposed

global classifier can adaptively adjust the global opti-

mization to boost the recognition upon available local

measurements. To corroborate the validity of the algo-

rithm, and to safeguard the reproducibility of the sys-

tem performance, we have published an open bench-

mark database called WARD with this paper. The high

recognition accuracy on the WARD database indicates

that DSC should be able to classify other action cat-

egories such as falling, bicycling, and hand motions

with similar high accuracy.

One important observation w.r.t. the choice of

sensor locations on the human body is that the mo-

tion measurements from the ankle locations may not

discriminate certain categories of upper-body motions

and even lower-body motions. We have suggested replacing the ankle locations with other locations around

the knees and thighs in order to improve the classi-

fication. Another limitation in the current system and

most other body sensor systems is that the wearable

sensors need to be firmly positioned at the designated

locations. However, a more practical system/algorithm

should tolerate certain degrees of shift without sacri-

ficing the accuracy. In this case, the variation of the

measurement for different action classes would in-

crease substantially. One open question is what low-

dimensional linear/nonlinear models one may use to

model such more complex data, and whether the sparse

representation framework can still apply to approxi-

mate such structures with limited numbers of training

examples. A potential solution to this question will be a meaningful step forward both in theory and in practice.

Table 5

Confusion table of the 13 action classes for DSC using sensors 1–5 (in percentage).

         1     2     3     4     5     6     7     8     9     10    11    12    13

1 (ST)   87.2  10.2  0.7   0     0     0     0.1   1.8   0     0     0     0     0

2 (SI)   25.2  66.8  6.8   0     0     0     0.1   0.1   0     0.1   0     0.1   0.7

3 (LI)   2.6   5.1   91.8  0     0     0     0     0     0     0     0     0.1   0.3

4 (WF)   0     0     0     92    2.5   1.6   0.2   0.2   0.4   0.7   0     0.2   2.3

5 (WL)   0.1   0     0     0.2   97.3  0     0.6   0.3   0.3   0.1   0.1   0.2   1

6 (WR)   0     0     0     0.1   0.1   95.7  0.2   0.4   0.4   0.4   0.5   0.2   2

7 (TL)   0     0     0     0     0.6   0     97    2.3   0     0     0     0     0.1

8 (TR)   0     0     0     0     0     1.6   3.1   95.2  0     0     0     0     0

9 (UP)   0     0     0     0     0     0     0     0     98    0.1   1.6   0.1   0.2

10 (DO)  0     0     0     0.2   0.1   0     0     0     0.1   98.3  0     0.5   0.8

11 (JO)  0     0     0     0     0     0     0     0     0.5   0     99.3  0.1   0.1

12 (JU)  0.1   0     0     0     0     0     0     0     0.3   0.6   0.5   97.9  0.5

13 (PU)  0.3   0.1   0     0.1   0.2   0.1   0.1   0.1   0     0.2   0.2   0.1   98.6

Acknowledgments

We would like to thank Sameer Iyengar, Victor Shia, and Posu Yan at the University of California, Berkeley, Dr. Philip Kuryloski at Cornell University, Katherine Gilani at the University of Texas at Dallas, Ville-Pekka Seppa at the Tampere University of Technology, Finland, and Dr. Marco Sgroi and Roberta Giannantonio at Telecom Italia for their kind help in designing the wearable motion sensor system and the WARD database.

References

[1] R. Aylward and J. Paradiso, A compact, high-speed, wearable

sensor network for biomotion capture and interactive media,

Proceedings of the International Conference on Information

Processing in Sensor Networks, 380–389, 2007.

[2] L. Bao and S. Intille, Activity recognition from user-annotated

acceleration data, Proceedings of the International Conference

on Pervasive Computing, 1–17, 2004.

[3] P. Barralon, N. Vuillerme, and N. Noury, Walk detection with a

kinematic sensor: Frequency and wavelet comparison, Proceed-

ings of the 28th IEEE EMBS Annual International Conference,

1711–1714

[4] A. Benbasat and J. Paradiso, Groggy wakeup - automated gener-

ation of power-efficient detection hierarchies for wearable sen-

sors, Proceedings of International Workshop on Wearable and

Implantable Body Sensor Networks, 2007.

[5] E. Candès, Compressive sampling, Proceedings of the Interna-

tional Congress of Mathematicians, 1–20, 2006.

[6] E. Candès and T. Tao, Near-optimal signal recovery from ran-

dom projections: Universal encoding strategies?, IEEE Trans-

actions on Information Theory, vol. 52, No. 12, 5406–5425,

2006.

[7] J. Chen, K. Kwong, D. Chang, J. Luk, and R. Bajcsy, Wear-

able sensors for reliable fall detection, Proceedings of the IEEE

Engineering in Medicine and Biology Conference, 3551–3554,

2005.

[8] T. Choudhury, S. Consolvo, B. Harrison, J. Hightower,

A. LaMarca, L. LeGrand, A. Rahimi, A. Rea, G. Borriello,

B. Hemingway, P. Klasnja, K. Koscher, J. Landay, J. Lester, and

D. Wyatt, The mobile sensing platform: An embedded activity

recognition system, Pervasive Computing, 32–41, 2008.

[9] T. Degen, H. Jaeckel, M. Rufer, and S. Wyss, SPEEDY: A fall

detector in a wrist watch, Proceedings of the IEEE International

Symposium on Wearable Computers, 184–187, 2003.

[10] D. Donoho, Neighborly polytopes and sparse solution of un-

derdetermined linear equations, (preprint) 2005.

[11] D. Donoho and M. Elad, On the stability of the basis pursuit

in the presence of noise, Signal Processing, vol. 86, 511–532,

2006.

[12] J. Farringdon, A. Moore, N. Tilbury, J. Church, and

P. Biemond, Wearable sensor badge & sensor jacket for con-

text awareness, Proceedings of the International Symposium on

Wearable Computers, 107–113, 1999.

[13] X. He, S. Yan, Y. Hu, P. Niyogi, and H. Zhang, Face recogni-

tion using Laplacianfaces, IEEE Trans. on Pattern Analysis and

Machine Intelligence, vol. 27, no. 3, 328–340, 2005.

[14] E. Heinz, K. Kunze, and S. Sulistyo, Experimental evaluation

of variations in primary features used for accelerometric con-

text recognition, Proceedings of the European Symposium on

Ambient Intelligence, 252–263, 2003.

[15] T. Huynh and B. Schiele, Analyzing features for activity recog-

nition, Proceedings of the Joint Conference on Smart Objects

and Ambient Intelligence, 159–163, 2005.


[16] H. Kemper and R. Verschuur, Validity and reliability of pe-

dometers in habitual activity research, European Journal of Ap-

plied Physiology, vol. 37, No. 1, 71–82, 1977.

[17] N. Kern, B. Schiele, and A. Schmidt, Multi-sensor activity con-

text detection for wearable computing, Proceedings of the Euro-

pean Symposium on Ambient Intelligence, 220–232, 2003.

[18] P. Lukowicz, J. Ward, H. Junker, M. Stäger, G. Tröster,

A. Atrash, and T. Starner, Recognizing workshop activity using

body worn microphones and accelerometers, Proceedings of the

International Conference on Pervasive Computing, 18–32, 2004.

[19] J. Mantyjarvi, J. Himberg, and T. Seppanen, Recognizing hu-

man motion with multiple acceleration sensors, Proceedings of

the IEEE International Conference on Systems, Man and Cyber-

netics, 747–752, 2001.

[20] T. Martin, B. Majeed, B. Lee, and N. Clarke, Fuzzy ambient in-

telligence for next generation telecare, Proceedings of the IEEE

International Conference on Fuzzy Systems, 894–901, 2006.

[21] M. Mathie, A. Coster, N. Lovell, and B. Celler, Accelerometry:

Providing an integrated, practical method for long-term, ambu-

latory monitoring of human movement, Physiological Measure-

ment, vol. 25, R1–R20, 2004.

[22] J. Morrill, Distributed recognition of patterns in time series

data, Communications of the ACM, vol. 41, No. 5, 45–51, 1998.

[23] B. Najafi, K. Aminian, A. Parschiv-Ionescu, F. Loew, C. Büla,

and P. Robert, Ambulatory system for human motion analysis

using a kinematic sensor: Monitoring of daily physical activity

in the elderly, IEEE Transactions on Biomedical Engineering,

vol. 50, No. 6, 711-723, 2003.

[24] I. Pappas, T. Keller, S. Mangold, M. Popovic, V. Dietz, and

M. Morari, A reliable gyroscope-based gait-phase detection

sensor embedded in a shoe insole, IEEE Sensors Journal, vol. 4,

No. 2, 268–274, 2004.

[25] S. Pirttikangas, K. Fujinami, and T. Nakajima, Feature selec-

tion and activity recognition from wearable sensors, Proceed-

ings of the International Symposium on Ubiquitous Computing

Systems, 2006.

[26] C. Sadler and M. Martonosi, Data compression algorithms for

energy-constrained devices in delay tolerant networks, Proceed-

ings of the ACM Conference on Embedded Networked Sensor

Systems, 265–278, 2006.

[27] A. Sixsmith and N. Johnson, A smart sensor to detect the falls

of the elderly, Pervasive Computing, 42–47, 2004.

[28] A. Yang, R. Jafari, P. Kuryloski, S. Iyengar, S. Sastry, and

R. Bajcsy, Distributed segmentation and classification of human

actions using a wearable sensor network, Proceedings of the

CVPR Workshop on Human Communicative Behavior Analy-

sis, 2008.

[29] G. Williams, K. Doughty, K. Cameron, and D. Bradley, A smart

fall and activity monitor for telecare applications, Proceedings

of the IEEE International Conference in Medicine and Biology

Society, 1998.

[30] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, Ro-

bust face recognition via sparse representation, (in press) IEEE

Transactions on Pattern Analysis and Machine Intelligence,

2008.

[31] W. Zhang, L. He, Y. Chow, R. Yang, and Y. Su, The study on

distributed speech recognition system, Proceedings of the IEEE

International Conference on Acoustics, Speech, and Signal Pro-

cessing, 1431–1434, 2000.

Distributed Segmentation and Classification of Human Actions

Using a Wearable Motion Sensor Network∗

Allen Y. Yang, Sameer Iyengar,

Shankar Sastry, Ruzena Bajcsy

Department of EECS

University of California, Berkeley

Philip Kuryloski

Department of ECE

Cornell University

Roozbeh Jafari

Department of EE

University of Texas, Dallas

Abstract

We propose a distributed recognition method to classify

human actions using a low-bandwidth wearable motion sen-

sor network. Given a set of pre-segmented motion sequences

as training examples, the algorithm simultaneously segments

and classifies human actions, and it also rejects outlying ac-

tions that are not in the training set. The classification is

distributedly operated on individual sensor nodes and a base

station computer. We show that the distribution of multiple

action classes satisfies a mixture subspace model, one sub-

space for each action class. Given a new test sample, we

seek the sparsest linear representation of the sample w.r.t. all

training examples. We show that the dominant coefficients in

the representation only correspond to the action class of the

test sample, and hence its membership is encoded in the rep-

resentation. We further provide fast linear solvers to compute

such representation via ℓ1-minimization. Using up to eight

body sensors, the algorithm achieves state-of-the-art 98.8%

accuracy on a set of 12 action categories. We further demon-

strate that the recognition precision only decreases grace-

fully using smaller subsets of sensors, which validates the

robustness of the distributed framework.

1. Introduction

We study human action recognition using a distributed

wearable motion sensor network. Action recognition has

been studied to a great extent in computer vision in the past.

Compared with a model-based or appearance-based vision

system, the body sensor network approach has the following

advantages: 1. The system does not require instrumenting the

environment with cameras or other sensors. 2. The system

has the necessary mobility to support continuous monitoring

∗Corresponding author: [email protected]. This work

was partially supported by ARO MURI W911NF-06-1-0076, NSF TRUST

Center, and the startup funding from the University of Texas and Texas In-

struments.

of a subject during her daily activities. 3. With the continuing

miniaturization of mobile processors and sensors, it has be-

come possible to manufacture wearable sensor networks that

densely cover the human body to record and analyze very

small movements of the human body (e.g., breathing and

spine movements). Such sensor networks can be used in ap-

plications such as medical-care monitoring, athlete training,

tele-immersion, and human-computer interaction (e.g., inte-

gration of accelerometers in Wii game controllers and smart

phones).

Figure 1. A wireless body sensor system.

In traditional sensor networks, the computation carried by

the sensor board is fairly simple: Extract certain local in-

formation and transmit the data to a computer server over

the network for processing. In this paper, we propose a new

method for distributed pattern recognition. In such a system,

each sensor node will be able to classify local, albeit biased,

information. Only when the local classification detects a pos-

sible object/event does the sensor node become active and

transmit the measurement to the server.¹ On the server side,

a global classifier receives data from the sensor nodes and

further optimizes the classification. The global classifier can

¹Studies have shown that the power consumption required to successfully send one byte over a wireless channel is equivalent to executing between 1e3 and 1e6 instructions on an onboard processor [18]. Hence it is paramount in sensor networks to reduce the communication cost while preserving the recognition performance.


be more computationally involved than the distributed clas-

sifiers, but it has to adapt to the change of available network

sensors due to local measurement error, sensor failure, and

communication congestion.

Past studies on sensor-based action recognition were pri-

marily focused on single accelerometers [8, 10] or other mo-

tion sensors [11, 16]. More recent systems prefer using mul-

tiple motion sensors [1, 2, 9, 12–14, 17]. Depending on the

type of sensor used, an action recognition system is typically

composed of two parts: a feature extraction module and a

classification module.

There are three major directions for feature extraction in

wearable sensor networks. The first direction uses simple

statistics of a signal sequence such as the max, mean, vari-

ance, and energy. The second type of feature is computed

using fixed filter banks such as FFT and wavelets [10, 16].

The third type is based on classical dimensionality reduc-

tion techniques such as principal component analysis (PCA)

and linear discriminant analysis (LDA) [13, 14]. In terms of

classification on the action features, a large body of previ-

ous work favored thresholding or k-nearest-neighbor (kNN)

due to the simplicity of the algorithms for mobile devices

[10, 16, 17]. Other more sophisticated techniques have also

been used, such as decision trees [2, 3] and hidden Markov

models [13].

For distributed pattern recognition, there exist studies on

distributed speech recognition [20] and distributed expert

systems [15]. One particular problem associated with most

distributed sensor systems is that each local observation from

the distributed sensors is biased and insufficient to classify

all classes. For example, in our system, sensors placed on the

lower body would not perform well in classifying

actions that mainly involve upper-body motions, and vice

versa. Consequently, traditional majority-voting type clas-

sifiers may not achieve the best performance globally.

Design of the wearable sensor network. Our wearable

sensor network consists of sensor nodes placed at various

body locations, which communicate with a base station at-

tached to a computer server through a USB port. The sen-

sor nodes and base station are built using the commercially

available Tmote Sky boards. Tmote Sky runs TinyOS on an

8 MHz microcontroller with 10 KB of RAM and communicates

using the 802.15.4 wireless protocol. Each custom-built sen-

sor board has a triaxial accelerometer and a biaxial gyro-

scope, which is attached to Tmote Sky (shown in Fig 2).

Each axis is reported as a 12-bit value to the node, indicating

values in the range of ±2g and ±500◦/s for the accelerome-

ter and gyroscope, respectively.
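The mapping from a 12-bit reading to physical units can be sketched as follows. This is only an illustration: the text gives the bit width and the ranges, but not the encoding, so the offset-binary layout and the function name `raw_to_physical` are assumptions.

```python
def raw_to_physical(raw: int, full_scale: float, bits: int = 12) -> float:
    """Map an unsigned raw count onto [-full_scale, +full_scale].

    Assumes (hypothetically) an offset-binary encoding in which count 0 is
    -full_scale and the maximum count is +full_scale.
    """
    max_count = (1 << bits) - 1  # 4095 for a 12-bit reading
    if not 0 <= raw <= max_count:
        raise ValueError(f"raw count {raw} outside 0..{max_count}")
    return (2.0 * raw / max_count - 1.0) * full_scale


ACCEL_FULL_SCALE_G = 2.0     # +/-2 g, from the text
GYRO_FULL_SCALE_DPS = 500.0  # +/-500 deg/s, from the text

accel_g = raw_to_physical(4095, ACCEL_FULL_SCALE_G)   # largest count
gyro_dps = raw_to_physical(0, GYRO_FULL_SCALE_DPS)    # smallest count
```

Under this assumed encoding, one count corresponds to roughly 1 mg for the accelerometer and about 0.24 deg/s for the gyroscope.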

To avoid packet collision in the wireless channel, we use a TDMA protocol that allocates each node a specific time slot during which to transmit data. This allows us to receive sensor data at 20 Hz with minimal packet loss. To avoid drift in the network, the base station periodically broadcasts a packet to resynchronize the nodes’ individual timers. The code to interface with the sensors and transmit data is implemented directly on the mote using nesC, a variant of C.

Figure 2. The sensor board with the accelerometer and gyroscope. The mother board at the back is Tmote Sky.
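A rough sketch of how TDMA slots might be laid out within one 20 Hz frame is given below. The equal-width slots, the absence of guard intervals or a dedicated resynchronization slot, and the name `tdma_slots` are assumptions made for illustration; the paper does not describe the slot layout, and the actual implementation runs in nesC on the motes.

```python
def tdma_slots(num_nodes: int, frame_rate_hz: float) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds of each node's slot in a frame.

    Assumes equal-width, back-to-back slots with no guard time.
    """
    frame_period = 1.0 / frame_rate_hz
    slot = frame_period / num_nodes
    return [(i * slot, (i + 1) * slot) for i in range(num_nodes)]


# Eight nodes reporting at 20 Hz: a 50 ms frame split into 6.25 ms slots.
slots = tdma_slots(num_nodes=8, frame_rate_hz=20.0)
```

With these numbers each node would have about 6.25 ms per frame in which to transmit, which is why clock drift between nodes has to be corrected periodically.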

Problem definition. Assume a set of $L$ wearable sensor nodes with triaxial accelerometers and biaxial gyroscopes are attached to the human body. Denote

$$a_l(t) = (x_l(t), y_l(t), z_l(t), \theta_l(t), \rho_l(t))^T \in \mathbb{R}^5$$

as the measurement of the five sensors on node $l$ at time $t$, and

$$a(t) = (a_1^T(t), a_2^T(t), \cdots, a_L^T(t))^T \in \mathbb{R}^{5L}$$

collects all sensor measurements. Denote $s = (a(1), a(2), \cdots, a(l)) \in \mathbb{R}^{5L \times l}$ as an action sequence of length $l$.

Given $K$ different classes of human actions, a set of $n_i$ training examples $\{s_{i,1}, \cdots, s_{i,n_i}\}$ are collected for each $i$th class. The durations of the sequences may naturally differ. Given a new test sequence $s$ that may contain multiple actions and possibly other outlying actions, we seek a distributed algorithm to simultaneously segment the sequence and classify the actions.

Solving this problem mainly involves the following chal-

lenges:

1. Simultaneous segmentation and classification. We seek

simultaneous segmentation and recognition from a long

motion sequence. Furthermore, we also assume that the

test sequence may contain other unknown actions that

are not from the K classes. The algorithm needs to be

robust to these outliers.

2. Variation of action durations. One major difficulty in

segmentation of actions is to determine the duration of

a proper action. In practice, the durations of different

actions vary dramatically (see Fig 3).

Figure 3. Population of different action durations in our data set.

Figure 4. Readings of the x-axis accelerometers (top) and x-axis gyroscopes (bottom) from 8 distributed sensors (shown in different colors)

on two repetitive “stand-kneel-stand” sequences from two subjects as the left and right columns.

3. Identity independence. In addition to the variation of

action durations, different people act differently for the

same actions (see Fig 4). For a test sequence in the ex-

periment, we examine the identity-independent perfor-

mance by excluding the training samples of the same

subject.

4. Distributed recognition. A distributed recognition sys-

tem needs to further consider the following issues: 1.

How to extract compact and accurate low-dimensional

action features for local classification and transmission

over a band-limited network? 2. How to classify the lo-

cal measurement in real time using low-power proces-

sors? 3. How to design a classifier to globally optimize

the recognition and be adaptive to the change of the net-

work?

Contributions of the paper. We propose a distributed ac-

tion recognition algorithm that simultaneously segments and

classifies 12 human actions using up to 8 wearable motion

sensors. The work is inspired by an emerging theory of

compressed sensing and sparse representation [4, 5]. We as-

sume each action class satisfies a low-dimensional subspace

model. We show that a 10-D LDA feature space suffices to

locally represent the 12 action subspaces on each node. If a

linear representation is sought to represent a valid test sam-

ple w.r.t. all training samples, the dominant coefficients in

the sparsest representation correspond to the training sam-

ples from the same action class, and hence they encode the

membership of the test sample. The implementation of the

system consists of three integrated components: 1. Multi-resolution

action feature extraction. 2. Fast distributed classifiers

via $\ell_1$-minimization. 3. An adaptive global classifier.

The method can accurately segment and classify human ac-

tions from a continuous motion sequence. The local classi-

fiers that reject potential outliers reduce the sensor-to-server

communication to about 50%. One can also choose to ac-

tivate only a subset of the sensors on the fly due to sensor

failure or network congestion. The global classifier is able to

adaptively update the optimization process and improve the

overall classification upon available local decisions.

Finally, research on action recognition using wearable

sensors in pattern recognition has been hindered to an extent

by the lack of a rigorous public database/benchmark with which

to judge the performance and safeguard the reproducibility

of extant algorithms. We intend to address this issue

by constructing and maintaining a public benchmark system

called “Wearable Action Recognition Database” (WARD).

The database will contain more human subjects across multi-

ple age groups, and it will be made available on our website.

2. Classification via Sparse Representation

We first present an efficient action classification method on each sensor node assuming action sequences are pre-segmented. Given an action segment of length $l$ from node $j$, $s_j = (a_j(1), a_j(2), \cdots, a_j(l)) \in \mathbb{R}^{5 \times l}$, define a new vector $s_j^S$ as the stacking of the $l$ columns of $s_j$:

$$s_j^S \doteq (a_j(1)^T, a_j(2)^T, \cdots, a_j(l)^T)^T \in \mathbb{R}^{5l}. \quad (1)$$

We will interchangeably use $s_j$ to denote the stacked vector $s_j^S$ without causing ambiguity.

Since the length $l$ varies among different subjects and actions, we need to normalize $l$ to be the same for all the training and test samples, which can be achieved by linear interpolation or FFT interpolation. After normalization, we denote the dimension of samples $s_j$ as $D_j = 5l$. Subsequently, we define a full-body action vector $v$ that stacks the measurements from all $L$ nodes:

$$v = (s_1^T, s_2^T, \cdots, s_L^T)^T \in \mathbb{R}^D, \quad (2)$$

where $D = D_1 + \cdots + D_L = 5lL$.
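The length normalization and stacking steps can be sketched in plain Python as below. This shows the linear-interpolation option only; the function names and the list-of-channels layout are illustrative assumptions, not the authors' code.

```python
def resample_linear(channel: list[float], new_len: int) -> list[float]:
    """Linearly interpolate one sensor channel to new_len samples."""
    old_len = len(channel)
    if old_len == 1 or new_len == 1:
        return [channel[0]] * new_len
    out = []
    for k in range(new_len):
        pos = k * (old_len - 1) / (new_len - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        out.append((1.0 - frac) * channel[lo] + frac * channel[hi])
    return out


def stack_segment(segment: list[list[float]], new_len: int) -> list[float]:
    """Resample each channel (row) to new_len, then stack the columns.

    The result is (a(1)^T, ..., a(new_len)^T)^T, as in eq. (1).
    """
    rows = [resample_linear(ch, new_len) for ch in segment]
    return [rows[c][t] for t in range(new_len) for c in range(len(rows))]


# One node, five channels, three samples each, normalized to length 5.
segment = [[0.0, 2.0, 4.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
           [3.0, 2.0, 1.0], [5.0, 5.0, 5.0]]
stacked = stack_segment(segment, new_len=5)
```

The stacked vector has $5l$ entries per node; concatenating the vectors of all $L$ nodes gives the full-body vector $v$ of eq. (2).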

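In this paper, we assume the samples $v$ in an action class satisfy a subspace model, called an action subspace. If the training samples $\{v_1, \cdots, v_{n_i}\}$ of the $i$th class sufficiently span the $i$th action subspace, given a test sample $y = (y_1^T, \cdots, y_L^T)^T \in \mathbb{R}^D$ in the same class $i$, $y$ can be linearly represented using the training examples of the same class:

$$
y = \alpha_1 v_1 + \cdots + \alpha_{n_i} v_{n_i}
\;\Leftrightarrow\;
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_L \end{pmatrix}
=
\left(
\begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_1
\cdots
\begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_L \end{pmatrix}_{n_i}
\right)
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n_i} \end{pmatrix}.
\quad (3)
$$

It is important to note that such a linear constraint also holds on each node $j$: $y_j = \alpha_1 s_{j,1} + \cdots + \alpha_{n_i} s_{j,n_i} \in \mathbb{R}^{D_j}$.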

In theory, complex data such as human actions typically

constitute complex nonlinear models. The linear models are

used to approximate such nonlinear structures in a higher-

dimensional subspace (see Fig 5). Notice that such lin-

ear approximation may not produce good estimation of the

distance/similarity metric for the samples on the manifold.

However, as we will show in Example 1, given sufficient

samples on the manifold as training examples, a new test

sample can be accurately represented on the subspace, pro-

vided that any two classes do not have similar subspace mod-

els.

Figure 5. Modeling a 1-D manifold M using a 2-D subspace V .

To recover label($y$), a previous study [19] proposed to reformulate the recognition using a global sparse representation: Since label($y$) $= i$ is unknown, we can represent $y$ using all the training samples from all $K$ classes:

$$
y = (A_1, A_2, \cdots, A_K)
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \end{pmatrix}
= Ax, \quad (4)
$$

where $A_i = (v_{i,1}, v_{i,2}, \cdots, v_{i,n_i}) \in \mathbb{R}^{D \times n_i}$ collects all the training samples of class $i$, $x_i = (\alpha_{i,1}, \alpha_{i,2}, \cdots, \alpha_{i,n_i})^T \in \mathbb{R}^{n_i}$ collects the corresponding coefficients in (3), and $A \in \mathbb{R}^{D \times n}$ where $n = n_1 + n_2 + \cdots + n_K$. Since $y$ satisfies both (3) and (4), one solution of $x$ in (4) should be $x^* = (0, \cdots, 0, x_i^T, 0, \cdots, 0)^T$. The solution is naturally sparse: on average, only a fraction $\frac{1}{K}$ of the terms in $x^*$ are nonzero.
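As a small illustration of the block structure of $x^*$ (the sizes here are hypothetical, chosen only for readability), take $K = 3$ classes with $n_1 = n_2 = n_3 = 2$ training samples each and a test sample $y$ from class 2:

```latex
A = (v_{1,1}, v_{1,2} \mid v_{2,1}, v_{2,2} \mid v_{3,1}, v_{3,2}) \in \mathbb{R}^{D \times 6},
\qquad
x^* = (0, 0, \alpha_{2,1}, \alpha_{2,2}, 0, 0)^T,
\qquad
y = A x^* = \alpha_{2,1} v_{2,1} + \alpha_{2,2} v_{2,2}.
```

At most $n_2 = 2$ of the $n = 6$ entries are nonzero, which matches the fraction $1/K = 1/3$ stated above.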

On each sensor $j$, solution $x^*$ of (4) is also a solution for the representation:

$$y_j = (A_1^{(j)}, A_2^{(j)}, \cdots, A_K^{(j)}) x = A^{(j)} x, \quad (5)$$

where $A_i^{(j)} \in \mathbb{R}^{D_j \times n_i}$ consists of the row vectors in $A_i$ that correspond to the $j$th node. Hence, $x^*$ can be solved either globally using (4) or locally using (5), provided that the action data measured on each node are sufficiently discriminant. We will come back to the discussion about local classification versus global classification in Section 3. In the rest of this section, however, our focus will be on each node.

One major difficulty in solving (5) is the high dimensionality of the action data. In compressed sensing [4, 5], one reduces the dimension of a linear system by choosing a linear projection $R_j \in \mathbb{R}^{d \times D_j}$:²

$$\tilde{y}_j \doteq R_j y_j = R_j A^{(j)} x \doteq \tilde{A}^{(j)} x \in \mathbb{R}^d. \quad (6)$$

After projection $R_j$, typically the feature dimension $d$ is much smaller than the number $n$ of all training samples. Therefore, the new linear system (6) is underdetermined. Numerically stable solutions exist to uniquely recover sparse solutions $x^*$ via $\ell_1$-minimization [6]:

$$x^* = \arg\min_x \|x\|_1 \quad \text{subject to} \quad \tilde{y}_j = \tilde{A}^{(j)} x. \quad (7)$$
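One way to experiment with the $\ell_1$ program in (7) is to recast it as a linear program. The sketch below uses NumPy and SciPy's `linprog` with the standard split $x = u - v$, $u, v \ge 0$; it is a stand-in for illustration and not the SparseLab basis pursuit routine used in the paper. The function name `basis_pursuit` and the toy matrix are assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def basis_pursuit(A: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Solve min ||x||_1 subject to A x = y as a linear program.

    Writes x = u - v with u, v >= 0, so that ||x||_1 = sum(u) + sum(v)
    at the optimum.
    """
    _, n = A.shape
    cost = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    result = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    u, v = result.x[:n], result.x[n:]
    return u - v


# Toy underdetermined system: 2 equations, 3 unknowns.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 1.0])
x = basis_pursuit(A, y)
```

In this toy system every solution has the form $(1 - t, 1 - t, t)$, and the $\ell_1$ norm is smallest at $t = 1$, so the minimizer puts all its weight on the third column.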

In our experiment, we have tested multiple projection operators including PCA, LDA, and the random projection studied in [19]. We found that 10-D feature spaces using LDA lead to the best recognition in a very low-dimensional space.

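After the (sparsest) representation $x$ is recovered, we project the coefficients onto each action subspace:

$$\delta_i(x) = (0, \cdots, 0, x_i^T, 0, \cdots, 0)^T \in \mathbb{R}^n, \quad i = 1, \cdots, K. \quad (8)$$

Finally, the membership of the test sample $y_j$ is assigned to the class with the smallest residual

$$\mathrm{label}(y_j) = \arg\min_{i = 1, \cdots, K} \|\tilde{y}_j - \tilde{A}^{(j)} \delta_i(x)\|_2, \quad (9)$$

where $\tilde{y}_j = R_j y_j$ and $\tilde{A}^{(j)} = R_j A^{(j)}$ are the projected quantities in (6).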
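The residual rule in (8) and (9) can be sketched in plain Python as follows. The matrix layout (a list of rows with columns grouped by class), the zero-based class labels, and the function names are illustrative assumptions.

```python
import math


def class_residuals(A, y, x, class_sizes):
    """Return ||y - A delta_i(x)||_2 for each class i.

    A is a d x n matrix given as a list of rows; its columns are grouped by
    class, with class_sizes[i] columns belonging to class i.
    """
    residuals = []
    start = 0
    for size in class_sizes:
        # delta_i(x): keep the coefficients of class i, zero out the rest.
        delta = [x[k] if start <= k < start + size else 0.0 for k in range(len(x))]
        approx = [sum(a * c for a, c in zip(row, delta)) for row in A]
        residuals.append(math.sqrt(sum((yi - ai) ** 2 for yi, ai in zip(y, approx))))
        start += size
    return residuals


def classify(A, y, x, class_sizes):
    """Pick the (zero-based) class with the smallest residual."""
    residuals = class_residuals(A, y, x, class_sizes)
    return min(range(len(residuals)), key=residuals.__getitem__)


# Hypothetical 2-D features, two classes with two training samples each.
A = [[1.0, 0.9, 0.0, 0.1],
     [0.0, 0.1, 1.0, 0.9]]
y = [0.1, 1.0]
x = [0.1, 0.0, 1.0, 0.0]
residuals = class_residuals(A, y, x, class_sizes=[2, 2])
label = classify(A, y, x, class_sizes=[2, 2])
```

Here the second class explains almost all of $y$, so its residual is the smaller one and the sample is assigned to it.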

Example 1 (Classification on Nodes) We designed 12 ac-

tion categories in the experiment: Stand-to-Sit, Sit-to-

Stand, Sit-to-Lie, Lie-to-Sit, Stand-to-Kneel, Kneel-to-Stand,

Rotate-Right, Rotate-Left, Bend, Jump, Upstairs, and Down-

stairs. The detailed experiment setup is given in Section 4.

To implement $\ell_1$-minimization on the sensor node, we look for fast sparse solvers in the literature. We have tested a variety of methods including (orthogonal) matching pursuit (MP), basis pursuit (BP), LASSO, and a quadratic log-barrier solver.³ We found that BP [7] gives the best trade-off between speed, noise tolerance, and recognition accuracy.

Here we demonstrate the accuracy of the BP-based algo-

rithm on each sensor node (see Fig 1 for their locations). The

actions are manually segmented from a set of long motion se-

quences from three subjects. In total there are 626 samples

in the data set. The 10-D feature selection is via LDA. We re-

quire the classification to be identity-independent. The accu-

racy of the classification is shown in Table 1. Fig 6 shows an

example of the estimated sparse coefficients x and its residuals.

In terms of speed, our simulation in MATLAB takes

on average 0.03 s to process one test sample on a typical

3 GHz PC.

²Notice that $R_j$ is not computed on the sensor node. These matrices are computed offline and simply stored on each sensor node.

³The implementation of these routines in MATLAB is available via SparseLab: http://sparselab.stanford.edu

Table 1. Recognition accuracy on each node over 12 action classes.

Sensor #       1     2     3     4    5     6     7   8

Accuracy [%]   99.9  99.4  99.9  100  95.3  99.5  93  100

Figure 6. Left: Sparse $\ell_1$ solution by BP for a Stand-to-Sit action

on the waist node. Right: Corresponding residuals. The action is

correctly classified as Class 1. SCI(x) = 0.7 (see (10)).

Example 1 shows that if the segmentation of the actions

is known and there are no invalid samples, all sensor

nodes can recognize the 12 actions individually with very

high accuracy, which also verifies that the mixture subspace

model is a good approximation of the action data. Nevertheless,

one may argue that in such low-dimensional feature

spaces other classical methods (e.g., kNN and decision tree

methods) should also perform well. In the next section, we

will show that the major advantage of adopting the sparse

representation framework is a unified solution to recognize

and segment valid actions and reject invalid ones. We will

also show that the method is adaptive to the change of avail-

able sensor nodes on the fly.

3. Distributed Segmentation and Recognition

We start by introducing multi-resolution action segmentation on each sensor node. From the training examples, we can estimate a range of possible lengths for all actions of interest. We then evenly divide the range into multiple length hypotheses: $(h_1, \cdots, h_s)$. At each time $t$ in a motion sequence, the node tests a set of $s$ possible segmentations:

$$y^{(1)} = (a(t - h_1), \cdots, a(t)), \;\cdots,\; y^{(s)} = (a(t - h_s), \cdots, a(t)),$$

as shown in Fig 7.⁴ With each candidate $y$ again normalized to length $l$, a sparse representation $x$ is estimated using the $\ell_1$-minimization of Section 2.
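The multi-resolution hypotheses can be sketched as below. The function names, the rounding of lengths to whole samples, and the numeric range are illustrative assumptions; the overlap check against previously detected actions described in the footnote is omitted for brevity.

```python
def length_hypotheses(min_len: int, max_len: int, s: int) -> list[int]:
    """Evenly divide [min_len, max_len] into s candidate segment lengths."""
    if s == 1:
        return [max_len]
    step = (max_len - min_len) / (s - 1)
    return [round(min_len + k * step) for k in range(s)]


def candidate_segments(samples: list, t: int, hypotheses: list[int]) -> list[list]:
    """Return the windows (a(t - h), ..., a(t)) for each h that fits."""
    return [samples[t - h: t + 1] for h in hypotheses if t - h >= 0]


samples = list(range(200))            # stand-in for measurements a(0..199)
hyps = length_hypotheses(20, 100, 5)  # hypothetical range of action lengths
windows = candidate_segments(samples, t=60, hypotheses=hyps)
```

Each window would then be resampled to the common length $l$ before the sparse representation is computed.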

Figure 7. Multiple segmentation hypotheses on a wrist sensor at

time t = 150 of a “go downstairs” sequence. h1 is a good segment

while others are false segments. Notice that the movement between

250 and 350 is an outlying action that the subject performed.

⁴Those segmentation hypotheses that overlap with previously detected actions will be ignored to avoid temporal ambiguity.

Based on the previous sparsity assumption, if $y$ is not a valid segmentation w.r.t. the training examples due to either incorrect $t$ or $h$, or the real action performed is not in the training classes, the dominant coefficients of its sparsest representation $x$ should not correspond to any single class. We use a sparsity concentration index (SCI) [19]:

$$\mathrm{SCI}(x) \doteq \frac{K \cdot \max_{j = 1, \cdots, K} \|\delta_j(x)\|_1 / \|x\|_1 - 1}{K - 1} \in [0, 1]. \quad (10)$$

If the nonzero coefficients of $x$ are evenly distributed among $K$ classes, then $\mathrm{SCI}(x) = 0$; if all the nonzero coefficients are associated with a single class, then $\mathrm{SCI}(x) = 1$. Therefore, we introduce a sparsity threshold $\tau_1$ applied to all sensor nodes: If $\mathrm{SCI}(x) > \tau_1$, the segment is a valid local measurement, and its 10-D LDA features $\tilde{y}$ will be sent to the base station.
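A direct transcription of the SCI formula (10) into Python might look like the following. The function name `sci` and the convention that coefficients are grouped by class in the order given by `class_sizes` are assumptions for this sketch.

```python
def sci(x: list[float], class_sizes: list[int]) -> float:
    """Sparsity concentration index of eq. (10)."""
    K = len(class_sizes)
    total = sum(abs(c) for c in x)
    if K < 2 or total == 0.0:
        raise ValueError("SCI needs K >= 2 classes and a nonzero x")
    class_norms, start = [], 0
    for size in class_sizes:
        # l1 norm of the coefficients that belong to this class
        class_norms.append(sum(abs(c) for c in x[start:start + size]))
        start += size
    return (K * max(class_norms) / total - 1.0) / (K - 1)


# All weight on one class versus weight spread evenly over three classes.
concentrated = sci([0.0, 0.0, 0.75, 0.25, 0.0, 0.0], [2, 2, 2])
spread = sci([0.5, 0.0, 0.5, 0.0, 0.5, 0.0], [2, 2, 2])
```

The two calls reproduce the extreme cases described in the text: a value of 1 when a single class carries all the coefficients and 0 when the classes share them equally.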

Figure 8. An invalid representation (SCI=0.13).

Next, we introduce a global classifier that adaptively optimizes the overall segmentation and classification. Suppose at time $t$ and with a length hypothesis $h$, the base station receives $L'$ action features from the active sensors ($L' \le L$). Without loss of generality, assume these features are from the first $L'$ sensors: $\tilde{y}_1, \tilde{y}_2, \cdots, \tilde{y}_{L'}$. Let $y' = (\tilde{y}_1^T, \cdots, \tilde{y}_{L'}^T)^T \in \mathbb{R}^{10L'}$. Then the global sparse representation $x$ of $y'$ satisfies the following linear system:

$$
y' =
\begin{pmatrix}
R_1 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & R_{L'} & \cdots & 0
\end{pmatrix}
A x = R' A x = A' x, \quad (11)
$$

where $R' \in \mathbb{R}^{10L' \times D}$ is a new projection matrix that only extracts the action features from the first $L'$ nodes. Consequently, the effect of changing active sensor nodes for the global classification is formulated via the global projection matrix $R'$. During the transformation, the data matrix $A$ and the sparse representation $x$ remain unchanged. The linear system (6) then becomes a special case of (11) when $L' = 1$.
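The effect of the block projection $R'$ can be sketched without forming the zero blocks explicitly: stacking $R_j A^{(j)}$ over the active nodes gives the same matrix as $R' A$. The helper names and the tiny matrices below are hypothetical.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def project_active(A_blocks, R_list, active):
    """Build A' = R' A for the given active nodes.

    A_blocks[j] is the D_j x n block of rows of A that belongs to node j and
    R_list[j] is that node's d x D_j projection. Stacking R_j A^(j) for the
    active nodes equals multiplying A by the block matrix R' of eq. (11).
    """
    A_prime = []
    for j in active:
        A_prime.extend(matmul(R_list[j], A_blocks[j]))
    return A_prime


# Two nodes, D_j = 2 raw dimensions each, n = 3 training samples, d = 1.
A_blocks = [[[1, 2, 3], [4, 5, 6]],
            [[7, 8, 9], [1, 0, 1]]]
R_list = [[[1, 0]],
          [[0, 1]]]
both = project_active(A_blocks, R_list, active=[0, 1])
only_second = project_active(A_blocks, R_list, active=[1])
```

Dropping a node simply removes its rows from $A'$, while the columns (training samples) and hence the meaning of $x$ stay the same.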

Similar to the outlier rejection criterion we previously

proposed on each node, we introduce a global rejection

threshold τ2. If SCI(x) > τ2 in (11), the most significant

coefficients in x are concentrated in a single training class.

Hence y′ is assigned to that class, and its length hypothesis

h provides the segmentation of the action from the motion

sequence.

The overall algorithm on the nodes and on the network

server provides a unified solution to segment and classify ac-

tion segments from a motion sequence using only two simple

parameters τ1 and τ2. Typically τ1 is selected to be less

restrictive than τ2 in order to increase the recall rate, because

passing certain amounts of false signal to the global classi-

fier is not necessarily disastrous as the signal would be re-

jected by τ2 when the action features from multiple nodes

are jointly considered. The formulation of adaptive classifi-

cation (11) via a global projection matrix R′ and two spar-

sity constraints τ1 and τ2 provides a simple means of reject-

ing outliers from a network of multiple sensors. The method

compares favorably to other classical methods such as kNN

and decision trees, because these methods need to train multiple

thresholds and decision rules when the number $L'$ and

the set of available sensors vary in the full-body action vector

$y' = (\tilde{y}_1^T, \cdots, \tilde{y}_{L'}^T)^T$.
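The two-threshold logic can be sketched as below, taking SCI values as given inputs. The thresholds are the values reported for the experiments in Section 4; the function names and the per-node SCI numbers are hypothetical.

```python
TAU_1 = 0.2  # local threshold used in the paper's experiments
TAU_2 = 0.4  # global threshold used in the paper's experiments


def active_nodes(local_sci: dict[int, float], tau1: float = TAU_1) -> list[int]:
    """Nodes whose local SCI exceeds tau1 transmit their features."""
    return sorted(node for node, value in local_sci.items() if value > tau1)


def accept_globally(global_sci: float, tau2: float = TAU_2) -> bool:
    """The base station keeps a segment only if its global SCI exceeds tau2."""
    return global_sci > tau2


local_sci = {1: 0.65, 2: 0.15, 7: 0.31, 8: 0.05}  # hypothetical values
senders = active_nodes(local_sci)
```

Because the local threshold is the looser of the two, a borderline segment such as the one on node 7 is still forwarded and the final decision is left to the global test.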

Finally, we consider how the change of active nodes affects $\ell_1$-minimization and the classification of the actions. In compressed sensing, the efficacy of $\ell_1$-minimization in solving for the sparsest solution $x$ in (11) is characterized by the $\ell_0/\ell_1$ equivalence relation [6, 7]. A necessary and sufficient condition for the equivalence to hold is the k-neighborliness of $A'$. As a special case, one can show that if $x$ is the sparsest solution in (11) for $L' = L$, $x$ is also a solution for $L' < L$. Hence, decreasing $L'$ may lead to sparser solutions of $x$.
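The special case can be checked in one line. Writing $R'_{L}$ and $R'_{L'}$ for the projection matrices with $L$ and $L' < L$ active nodes, and letting $S = (I_{10L'} \;\; 0)$ select the first $10L'$ rows (this notation is introduced here only for the sketch):

```latex
y'_{L} = R'_{L} A x
\;\Longrightarrow\;
y'_{L'} = S\, y'_{L} = S R'_{L} A x = R'_{L'} A x .
```

The reduced system keeps only a subset of the original equality constraints, so its feasible set can only grow, and the minimum $\ell_1$ norm over it can only stay the same or decrease.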

On the other hand, the decrease in available action fea-

tures also makes y′ less discriminant. For example, if we

reduce L′ to 1 and only activate a wrist sensor, then the $\ell_1$-solution

x may have nonzero coefficients associated with multiple

actions with similar wrist motions, albeit sparser. This

is an inherent problem for any method to classify human ac-

tions using a limited number of motion sensors. In theory,

if two action subspaces in a low-dimensional feature space

have a small subspace distance after the projection, the cor-

responding sparse representation cannot distinguish the test

samples from the two classes. We will demonstrate in Sec-

tion 4 that indeed reducing the available motion sensors will

reduce the discriminant power of the sparse representation in

a lower-dimensional space.

4. Experiment

We validate the performance of the system using a data

set we collected from three male subjects aged 28, 30,

and 32. Eight wearable sensors were placed at

different body locations (see Fig 1). We designed a set of 12

action classes: Stand-to-Sit (StSi), Sit-to-Stand (SiSt), Sit-to-

Lie (SiLi), Lie-to-Sit (LiSi), Stand-to-Kneel (StKn), Kneel-

to-Stand (KnSt), Rotate-Right (RoR), Rotate-Left (RoL),

Bend, Jump, Upstairs (Up), and Downstairs (Down). We

are particularly interested in testing the system under various

action durations. For this purpose, we have asked the sub-

jects to perform StSi, SiSt, SiLi, and LiSi with two differ-

ent speeds (slow and fast), and perform RoR and RoL with

two different rotation angles (90◦ and 180◦). All subjects

were asked to perform a sequence of related actions in each

recording session based on their own interpretation of the ac-

tions. In total there are 626 actions performed in the data set

(see Table 3 for the numbers in individual classes).

Table 2 shows Precision versus Recall of the algorithm

with different active sensor nodes. For all experiments,

τ1 = 0.2 and τ2 = 0.4. When all nodes are activated, the

algorithm can achieve 98.8% accuracy among the actions it

extracted, and 94.2% of the true actions are detected. The

performance decreases gracefully when more nodes become

unavailable to the global classifier. Our results show that if

we can maintain one motion sensor on the upper body (e.g.,

at position 2) and one on the lower body (e.g., at position 7),

the algorithm can still achieve 94.4% precision and 82.5% recall. Finally, on average the 8 distributed classifiers that

reject invalid local measurements reduce the node-to-station

communication by more than 50%.

Table 2. Precision vs. recall with different sets of activated sensors.

Sensors    2     7     2,7   1,2,7  1-3,7,8  1-8

Prec [%]   89.8  94.6  94.4  92.8   94.6     98.8

Rec [%]    65    61.5  82.5  80.6   89.5     94.2
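For reference, precision and recall under one common reading of the text (precision as the share of extracted actions that are correctly classified, recall as the share of true actions that are correctly detected) could be computed as in the sketch below. The counts are hypothetical and are not taken from the experiments.

```python
def precision_recall(num_correct: int, num_extracted: int, num_true: int) -> tuple[float, float]:
    """Return (precision, recall) from raw counts.

    precision = correct / extracted segments
    recall    = correct / true actions in the sequence
    """
    if num_extracted <= 0 or num_true <= 0:
        raise ValueError("counts must be positive")
    return num_correct / num_extracted, num_correct / num_true


# Hypothetical counts, for illustration only.
precision, recall = precision_recall(num_correct=45, num_extracted=50, num_true=60)
```

A classifier that rejects many segments as outliers can therefore keep precision high while recall drops, which is the pattern seen for the single-sensor columns of Table 2.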

One may be curious about the relatively low recall on sin-

gle sensors such as 2 and 7. This performance difference

is due to the large number of potential outlying segments

presented in a long motion sequence (e.g., see Fig 7). We

further compare the difference using the two confusion tables,

Tables 3 and 4. We see that a single node 2 that is positioned on

the right wrist performed poorly mainly on two action cate-

gories: Stand-Kneel and Upstairs-Downstairs, both of which

involve significant movements of the lower body but not the

upper one. This is the main reason for the low recall in Ta-

ble 2. On the other hand, for the actions that are detected

using node 2, our system can still achieve about 90% accu-

racy, which clearly demonstrates the robustness of the dis-

tributed recognition framework. Similar arguments also ap-

ply to node 7 and other sensor combinations.

Table 3. Confusion table using sensors 1-8.

Table 4. Confusion table using sensor 2.

Finally, we provide examples of the classification results on Subject 1 to demonstrate the accuracy of the proposed algorithm using all eight sensor nodes (1-8). For clarity, each figure in Figs 9-21 only plots the readings from the x-axis accelerometers on the 8 nodes. The segmentation results are then superimposed. The black solid boxes indicate the locations of the correctly classified action segments. The red boxes (e.g., in Fig 14) indicate the locations of false classification. One can also observe from the figures that some valid actions are not detected by the algorithm, e.g., in Fig 13.

Figure 9. Segmentation of a slow Stand-Sit-Stand sequence.

Figure 10. Segmentation of a fast Stand-Sit-Stand sequence.

Figure 11. Segmentation of a slow Sit-Lie-Sit sequence.

Figure 12. Segmentation of a fast Sit-Lie-Sit sequence.

5. Conclusion and Discussion

Inspired by the emerging compressed sensing theory, we

have proposed a distributed algorithm to segment and clas-

sify human actions on a wearable motion sensor network.

Figure 13. Segmentation of a Bend sequence.

Figure 14. Segmentation of a Stand-Kneel-Stand sequence.

Figure 15. Segmentation of a 90◦ Rotate-Right-Left sequence.

Figure 16. Segmentation of a 90◦ Rotate-Left-Right sequence.

Figure 17. Segmentation of a 180◦ Rotate-Right sequence.

Figure 18. Segmentation of a 180◦ Rotate-Left sequence.

Figure 19. Segmentation of a Jump sequence.

Figure 20. Segmentation of a Go-Upstairs sequence.

Figure 21. Segmentation of a Go-Downstairs sequence.

The framework provides a unified solution based on $\ell_1$-minimization

to classify valid action segments and reject outlying

actions on the sensor nodes and the base station. We

have shown through our experiment that a set of 12 action

classes can be accurately represented and classified using a

set of 10-D LDA features measured at multiple body loca-

tions. The proposed global classifier can adaptively adjust

the global optimization to boost the recognition upon avail-

able local measurements.

One limitation in the current system and most other body

sensor systems is that the wearable sensors need to be firmly

positioned at the designated locations. However, a more

practical system/algorithm should tolerate certain degrees

of shift without sacrificing the accuracy. In this case, the

variation of the measurement for different action classes

would increase substantially. One open question is what low-

dimensional linear/nonlinear models one may use to model

such more complex data, and whether the sparse representa-

tion framework can still apply to approximate such structures

with limited numbers of training examples. A potential solu-

tion to this question will be a meaningful step forward both

in theory and in practice.

References

[1] R. Aylward and J. Paradiso. A compact, high-speed, wearable

sensor network for biomotion capture and interactive media.

In IPSN, 2007.

[2] L. Bao and S. Intille. Activity recognition from user-annotated

acceleration data. In Pervasive, 2004.

[3] A. Benbasat and J. Paradiso. Groggy wakeup - automated

generation of power-efficient detection hierarchies for wear-

able sensors. In Int. Work. on Wearable and Implantable Body

Sensor Networks, 2007.

[4] E. Candes. Compressive sampling. In Proceedings of the In-

ternational Congress of Mathematicians, 2006.

[5] E. Candes and T. Tao. Near-optimal signal recovery from ran-

dom projections: Universal encoding strategies? IEEE Trans.

Information Theory, 52(12):5406–5425, 2006.

[6] D. Donoho. Neighborly polytopes and sparse solution of un-

derdetermined linear equations. preprint, 2005.

[7] D. Donoho and M. Elad. On the stability of the basis pursuit

in the presence of noise. Sig. Proc., 86:511–532, 2006.

[8] J. Farringdon, A. Moore, N. Tilbury, J. Church, and

P. Biemond. Wearable sensor badge & sensor jacket for con-

text awareness. In Int. Symp. on Wear. Comp., 1999.

[9] E. Heinz, K. Kunze, and S. Sulistyo. Experimental evalua-

tion of variations in primary features used for accelerometric

context recognition. In Euro. Symp. on Amb. Intel., 2003.

[10] T. Huynh and B. Schiele. Analyzing features for activity

recognition. In J. Conf. on Smart Objects and Ambient In-

telligence, 2005.

[11] H. Kemper and R. Verschuur. Validity and reliability of pe-

dometers in habitual activity research. European Journal of

Applied Physiology, 37(1):71–82, 1977.

[12] N. Kern, B. Schiele, and A. Schmidt. Multi-sensor activity

context detection for wearable computing. In European Sym-

posium on Ambient Intelligence, 2003.

[13] P. Lukowicz, J. Ward, H. Junker, M. Stager, G. Troster,

A. Atrash, and T. Starner. Recognizing workshop activity us-

ing body worn microphones and accelerometers. In Pervasive,

2004.

[14] J. Mantyjarvi, J. Himberg, and T. Seppanen. Recognizing hu-

man motion with multiple acceleration sensors. In Int. Conf.

on Sys., Man and Cyb., 2001.

[15] J. Morrill. Distributed recognition of patterns in time series

data. Communications of the ACM, 41(5):45–51, 1998.

[16] B. Najafi, K. Aminian, A. Parschiv-Ionescu, F. Loew, C. Bula,

and P. Robert. Ambulatory system for human motion anal-

ysis using a kinematic sensor: Monitoring of daily physical

activity in the elderly. IEEE Transactions on Biomedical En-

gineering, 50(6):711–723, 2003.

[17] S. Pirttikangas, K. Fujinami, and T. Nakajima. Feature selec-

tion and activity recognition from wearable sensors. In Int.

Symp. on Ubi. Comp. Sys., 2006.

[18] C. Sadler and M. Martonosi. Data compression algorithms

for energy-constrained devices in delay tolerant networks. In

ACM Conf. on Emb. Net. Sen. Sys., pages 265–278, 2006.

[19] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma. Robust

face recognition via sparse representation. (in press) PAMI,

2008.

[20] W. Zhang, L. He, Y. Chow, R. Yang, and Y. Su. The study on

distributed speech recognition system. In Int. Conf. on Acou.,

Speech, and Sig. Proc., pages 1431–1434, 2000.

DexterNet: An Open Platform for Heterogeneous Body Sensor Networks and Its Applications∗

Philip Kuryloski ‡,†,◦, Annarita Giani †, Roberta Giannantonio ∆, Katherine Gilani ⋆, Raffaele Gravina ◦, Ville-Pekka Seppä □, Edmund Seto ∇, Victor Shia †, Curtis Wang †, Posu Yan †, Allen Y. Yang †, Jari Hyttinen □, Shankar Sastry †, Stephen Wicker ‡, Ruzena Bajcsy †

† Department of EECS, University of California, Berkeley, CA 94720

‡ Department of ECE, Cornell University, Ithaca, NY 14853

∇ School of Public Health, University of California, Berkeley, CA 94720

∆ Telecom Italia, Turin, Italy

◦ WSN Lab sponsored by Pirelli and Telecom Italia, Berkeley, CA 94704

□ Department of Biomedical Engineering, Tampere University of Technology, Tampere, Finland

⋆ Department of EE, University of Texas at Dallas, TX 75080.

ABSTRACT

We design and implement a novel platform, called DexterNet, for heterogeneous body sensor networks. The system is motivated by shifting research paradigms to support real-time, persistent human monitoring in both indoor and outdoor environments. The platform adopts a three-layer, hierarchical architecture to control heterogeneous body sensors. The first layer, called the body sensor layer (BSL), deals with design of different wireless body sensors and their instrumentation on the body. We detail two custom-built body sensors: one measuring body motions and the other measuring the ECG and respiratory patterns. At the second layer, called the personal network layer (PNL), the wireless body sensors on a single subject communicate with a mobile computer station. The mobile station can be either a computer or a smart phone that supports Linux OS and the IEEE 802.15.4 protocol. It issues control commands to the body sensors and receives and processes sensor data measured by the body sensors. These functions are abstracted and implemented as an open-source software library, called Signal Processing In Node Environment (SPINE). A DexterNet network is scalable, and can be reconfigured on-the-fly via SPINE. At the third layer, called the global network layer (GNL), multiple PNLs communicate with a remote Internet server to permanently log the sensor data and support higher-level applications. We demonstrate the versatility of the DexterNet platform via three applications: avatar visualization, human activity recognition, and integration of DexterNet with global positioning sensors and air pollution sensors for asthma studies.

∗This work was supported in part by TRUST (The Team for Research in Ubiquitous Secure Technology), which receives support from the National Science Foundation (NSF award number CCF-0424422) and the following organizations: AFOSR (#FA9550-06-1-0244), Cisco, British Telecom, ESCHER, HP, IBM, iCAST, Intel, Microsoft, ORNL, Pirelli, Qualcomm, Sun, Symantec, Telecom Italia and United Technologies. This work was also supported in part by ARO MURI W911NF-06-1-0076, the Center for Information Technology Research in the Interest of Society (CITRIS), and Finnish Funding Agency for Technology and Innovation (Tekes). Corresponding author: P. Kuryloski ([email protected]).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

IPSN ’09 San Francisco, CA USA

Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.

Categories and Subject Descriptors

I.2.9 [Artificial Intelligence]: Robotics—Sensors; D.2.11 [Software Engineering]: Software Architectures—Domain-specific architectures

General Terms

Design, Experimentation, Physiological Sensing

Keywords

Sensor Networks, Body Sensing, Wearable Action Recognition, DexterNet, SPINE

1. INTRODUCTION

Wireless body sensor networks (BSNs) have been an emerging research area in the field of sensor networks in the past five years. The rapid development is mainly due to two reasons: 1. Continuing progress in the integration and miniaturization of sensors, processors, and radio devices. 2. Rising demand for advanced body sensor systems from pivotal areas of elderly protection and clinical patient monitoring to much broader applications in military, preventive healthcare, and consumer electronics. Traditional BSNs mainly involve single wearable sensors, such as fall detection [20, 8, 18, 6], walk and gait-phase detection [14, 3], and pulse-oximetry monitoring [12, 13]. More sophisticated systems may consist of multiple heterogeneous sensors, adopt certain hierarchical architecture for real-time sensor management, and even integrate body sensors with other environmental sensors. Some examples include CodeBlue [10], MobiCare [5], and ALARM-NET [21]. These systems instrument the human body as an active mobile platform, and have the necessary mobility to support persistent monitoring in people’s normal living environments.

In this paper, we present a novel platform for heterogeneous body sensor networks called DexterNet. The design principles of DexterNet are manifold:

1. DexterNet supports an open-source on-node signal processing library, namely, SPINE (Signal Processing In Node Environment) [19]. To the best of our knowledge, SPINE is the only open-source library that is versatile enough to support heterogeneous body sensors. Consequently, higher-level applications using DexterNet can seamlessly control other types of body sensors in the future via the SPINE library.

2. Harnessing the rich functionalities in SPINE, DexterNet supports real-time signal collection and sensor management on a network of heterogeneous body sensors. The configuration of a DexterNet network can also be modified on the fly. We have designed and manufactured two different body sensors: one measuring body motions and the other measuring the ECG and respiratory patterns. The system can also conveniently integrate other commercially available sensor nodes via SPINE, such as SHIMMER and MICAz.

3. To support long-term monitoring of multiple human subjects in both indoor and outdoor environments, DexterNet adopts a flexible three-layer BSN architecture. A body sensor layer (BSL) deals with the design of different sensors and their instrumentation on the body. A personal network layer (PNL) manages communication between the wireless body sensors and a mobile computer station. The mobile station can be either a computer or a smart phone that supports Linux OS and the IEEE 802.15.4 protocol. Finally, a global network layer (GNL) via the Internet permanently logs the sensor data and supports other higher-level applications on one or more secured network servers.

Figure 1 shows the three-layer architecture of DexterNet. At the BSL, the system supports two types of custom-built wireless wearable sensors. The first is a motion sensor board that consists of a triaxial accelerometer and a biaxial gyroscope. The second is a biological sensor (biosensor) called Wisepla [16], which integrates an electrical impedance pneumography (EIP) sensor, an electrocardiogram (ECG) sensor, and a triaxial accelerometer. Each sensor board connects to a sensor network mote to form a wearable sensor mote. Here we choose the commercially available TelosB board. At the PNL, the body sensors communicate with a Nokia N800 series Internet tablet via a TelosB base-station board. The SPINE functions installed on the body sensors and the N800 manage the collection, processing, and transmission of the data, and can be controlled via commands issued from the N800. Finally, at the GNL, multiple subjects instrumented with body sensors and N800s can remotely connect to a computer server via the Internet, and permanently log the motion and biological information for higher-level applications.

Figure 1: The three-layer architecture of the DexterNet system. The first layer is a body sensor layer (BSL), the second is a personal network layer (PNL), and the third layer is a global network layer (GNL). The pivotal component of the system is a Nokia N800 series Internet tablet at the PNL that communicates both to the BSL via IEEE 802.15.4 and the GNL via other broadband wireless channels.

Equipped with the versatile three-layer architecture and the open-source on-node library SPINE, DexterNet presents a competitive framework to support a variety of applications in healthcare, military, and consumer electronics. For example, the fall detection function has been implemented at the BSL level using SPINE on-node functions, and each motion sensor is capable of outputting a binary decision of a falling event. Such functions reduce the amount of data that must be transmitted between the nodes and the base station.¹ More sophisticated applications such as human activity recognition and reconstruction of a graphical avatar for 3-D visualization can be implemented at the PNL level, which rely on the full-body motion data measured by multiple motion sensors at different key locations of the body (as shown in Figure 2).

Figure 2: Illustration of the DexterNet system instrumented on a wearer. The deployment includes five motion sensor motes, a Nokia N800 tablet, and a GPS positioning sensor.

1.1 Related Work

Similar to DexterNet, many existing BSN platforms embrace a hierarchical architecture for real-time sensor control and data management. Some representative platforms are shown in Table 1. A more comprehensive literature overview can be found in [21, 11, 24].

HealthGear [13] is a single-modality sensor network that integrates a low-power pulse oximeter with a smart phone via Bluetooth. CodeBlue [10] is a wireless sensor platform intended for deployment in emergency medical care. It integrates a pulse oximeter and an ECG sensor with PDAs and PCs to enhance seamless transfer of data among caregivers. The platform uses IEEE 802.15.4 as the wireless protocol, and is intended to scale in dense networks with volatile network conditions.

¹ Studies have shown that the power consumption required to successfully send one byte over a wireless channel is equivalent to executing between 1e3 and 1e6 instructions on an onboard processor [15]. Hence it is paramount in sensor networks to reduce the communication cost while preserving the recognition performance.

WWBAN [11] adopts a three-layer multi-sensor platform that is similar to DexterNet. Multiple motion sensors and ECG sensors are placed on the human body. They communicate with either a PDA or a PC to provide a transparent interface to the user, and an interface to the (remote) medical server using the Internet. However, the system is mainly comprised of proprietary software, and it does not provide an open-source library such as SPINE to support on-node computation and decision-making.

Finally, ALARM-NET [21] belongs to a group of wireless sensor networks for assisted living. The focus of the system is the integration of body sensors with environmental sensor networks in a scalable and heterogeneous architecture. ALARM-NET uses MICAz sensors and STARGATE to relay the information from body sensors and environmental sensors to PDAs and PCs in a large and complex indoor setting using either Bluetooth or the 802.11 protocol.

The rest of the paper is organized as follows: Section 2 proposes the overall architecture of DexterNet and explains the relationship among different components of the three-layer hierarchy. Based on the hierarchy, Section 3 first discusses the design and specification of the body sensors in the bottom layer BSL. Section 4 then discusses the open-source SPINE framework that provides software services and control of both BSL and PNL. Section 5 showcases three high-level applications: 1. Avatar visualization. 2. Human activity recognition. 3. Integration of DexterNet with portable air pollution sensors for the study of asthma attacks. Finally, Section 6 discusses some limitations of the current implementation and future directions.

2. SYSTEM ARCHITECTURE

DexterNet is comprehensive in that it covers sensing, distributed processing of sensor data, wireless communication and fusion of data, and serves as a foundation for higher-level applications. Although subsets of these functionalities exist in other systems, often the same functionalities must be re-implemented in each system in order to produce a complete path for data. We hope that by providing an open system with DexterNet, a common platform may arise that reverses this scenario, so that variations in functionality are achieved through the use of a common base. Furthermore, the diverse nature of our team has driven the design requirement that DexterNet provide maximum flexibility and extensibility, with maximum potential for reusability of its components.

The structure of DexterNet is shown in Figure 3. The open-source SPINE framework provides the flexibility in constructing physical components of the system at the BSL and PNL layers. Particularly, SPINE has been developed such that there is separation in code of its sensing, processing and data transport features. As a result, SPINE is portable across TinyOS mote platforms, and easily extends to support new sensors through the use of sensor drivers. Support for both the motion sensor and biosensor in DexterNet is provided in this manner on the TelosB.

Table 1: Comparison of existing body sensor networks with the DexterNet platform.

Platform        | Sensor Devices                        | Base Devices      | Node Protocols    | Open Source | Environmental Sensors
HealthGear [13] | pulse oximeter                        | smart phone       | Bluetooth         | No          | No
CodeBlue [10]   | pulse oximeter, ECG, SHIMMER          | PC, PDA           | 802.15.4          | No          | No
WWBAN [11]      | motion, ECG                           | PC, PDA           | 802.15.4          | No          | No
ALARM-NET [21]  | pulse oximetry, motion, ECG           | STARGATE, PDA, PC | Bluetooth, 802.11 | No          | Yes (temperature, light, PIR)
DexterNet       | motion, ECG, EIP, GPS, MICAz, SHIMMER | PDA, PC           | 802.15.4          | Yes         | Possible via SPINE (e.g., air pollution sensor)

Figure 3: Architecture of the DexterNet system. The Body Sensing Layer (BSL) includes motes and attached sensors. The Personal Network Layer (PNL) includes the N800 portable base station and associated sensors. The BSL and PNL are driven by SPINE. The Global Network Layer (GNL) includes our applications built with the DexterNet system.

The distributed processing facilities of DexterNet are also provided by the SPINE framework. Currently, it provides two modules for data processing: one that periodically performs feature extraction on sensor data and reports it, and a second that reports chosen features conditionally based upon thresholds. New data processing components can also be added to SPINE without affecting sensor or network communication code.
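The two processing modes described above can be illustrated with a minimal Python sketch. This is not SPINE code (the node module is written for TinyOS); the function names, the feature table, and the window/shift parameters are all hypothetical and chosen only to show the idea of periodic feature extraction followed by conditional, threshold-based reporting.

```python
from statistics import mean
from typing import Callable, Dict, Iterator, List, Optional, Sequence

# Hypothetical feature table; these names are illustrative, not SPINE identifiers.
FEATURES: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": mean,
    "min": min,
    "max": max,
    "range": lambda w: max(w) - min(w),
}


def periodic_features(samples: Sequence[float], window: int, shift: int,
                      names: List[str]) -> Iterator[Dict[str, float]]:
    """Yield one feature dict per window of `window` samples, advancing by `shift`."""
    for start in range(0, len(samples) - window + 1, shift):
        w = samples[start:start + window]
        yield {name: FEATURES[name](w) for name in names}


def threshold_report(features: Dict[str, float], name: str,
                     lower: Optional[float] = None,
                     upper: Optional[float] = None) -> Optional[float]:
    """Return the feature value only if it lies outside [lower, upper]; else None."""
    value = features[name]
    if lower is not None and value < lower:
        return value
    if upper is not None and value > upper:
        return value
    return None  # nothing to transmit for this window
```

In the first mode every window produces a report; in the second, a report is produced only when a chosen feature crosses a threshold, so quiet periods generate no radio traffic.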

All SPINE functionalities are dynamically configured over the air, helping to achieve the runtime flexibility and reconfigurability we desire at the BSL and PNL layers. SPINE includes a base station component, and allows the use of a Nokia N800 or PC at the PNL layer. The N800 allows the wearable system to be portable, and allows for the integration of GPS and other environmental sensors. Our experience has shown that integrating various commercial devices, such as the N800, requires considerable effort. DexterNet aims to maximize the utility of such efforts.

Each of our applications, including avatar visualization, action recognition and asthma studies, is built on top of the SPINE base station API. They each configure the appropriate sensors and on-node signal processing according to their specific goals and requirements. These applications need not depend on any specific sensor mote code or directly interact with the BSL. This ensures that all applications benefit from enhancements made at the BSL and PNL of the system. These can and have included improvements in robustness, capacity, and energy consumption. Furthermore, developers can work simultaneously on application-level software and on SPINE software without the tight coupling required in traditional application-specific sensor network systems. In addition, it is worth noting that the use of a mobile base station such as the N800 reduces the burden on each wearable node when implementing privacy- and security-preserving features. One can use the higher capacity of the base station to manage the privacy requirements across geographic locations and to authenticate various individuals, a scenario not possible with a multi-person BSL or a hybrid body and environmental sensor environment.

3. DESIGN OF BODY SENSORS

3.1 Motion Sensors

DexterNet supports the deployment of multiple motion sensor nodes placed at different body locations (see Figure 2), which communicate with a base station. The sensor nodes and the base station are built using the TelosB boards. TelosB runs TinyOS on an 8 MHz microcontroller with 10K RAM and communicates using the 802.15.4 wireless protocol. Each custom-built sensor board has a triaxial accelerometer and a biaxial gyroscope, and is attached to a TelosB (shown in Figure 4). Each axis is reported as a 12-bit value to the node, indicating values in the range of ±2g and ±500°/s for the accelerometer and gyroscope, respectively. The battery life of continuous measurement and wireless raw data output is approximately 20 hours.
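To make the 12-bit ranges concrete, the following Python sketch converts a raw count to physical units. It assumes an unsigned count in 0..4095 with mid-scale (2048) corresponding to zero and a linear response over the full range; that encoding is our assumption for illustration and is not stated for the actual sensor board.

```python
ADC_BITS = 12
ADC_LEVELS = 1 << ADC_BITS  # 4096 distinct codes for a 12-bit value


def counts_to_units(raw: int, full_scale: float) -> float:
    """Map a raw count to the range [-full_scale, +full_scale).

    Assumes (hypothetically) an offset-binary encoding with zero at mid-scale.
    """
    if not 0 <= raw < ADC_LEVELS:
        raise ValueError("raw count out of range for a 12-bit value")
    units_per_count = 2.0 * full_scale / ADC_LEVELS
    return (raw - ADC_LEVELS / 2) * units_per_count


# Resolution implied by the stated ranges under this assumption:
ACCEL_G_PER_COUNT = 2.0 * 2.0 / ADC_LEVELS     # about 0.98 mg per count
GYRO_DPS_PER_COUNT = 2.0 * 500.0 / ADC_LEVELS  # about 0.24 deg/s per count
```

For example, `counts_to_units(3072, 500.0)` gives 250.0 °/s under the stated assumption.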

Figure 4: Illustration of the motion sensor node. The sensor board on the top is a custom-built motion sensor with a triaxial accelerometer and a biaxial gyroscope. The middle layer is a Li-ion battery. The sensor board on the bottom is a standard TelosB network node.

The current hardware design of the sensor contributes certain amounts of measurement error. The accelerometers typically require some calibration in the form of a linear correction, as sensor output under 1g may be shifted up to 15% in some sensors. It is also worth noting that the gyroscopes produce an indication of rotation under straight-line motions. Fortunately, these systematic errors appear to be consistent across experiments for a given sensor board. However, without calibration to correct them, the errors may affect the action recognition if different sets of sensors are used interchangeably in the experiment.²
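A linear correction of this kind can be sketched as a two-point calibration per axis: record the reading with the axis pointing straight up (+1g) and straight down (−1g), then solve for a gain and an offset. The procedure and the function names below are an illustrative assumption, not the calibration routine actually used with these boards.

```python
from typing import Tuple


def fit_axis(reading_plus_1g: float, reading_minus_1g: float) -> Tuple[float, float]:
    """Fit reading = gain * true_g + offset from two static poses (+1g and -1g)."""
    gain = (reading_plus_1g - reading_minus_1g) / 2.0
    offset = (reading_plus_1g + reading_minus_1g) / 2.0
    if gain == 0:
        raise ValueError("identical readings; axis cannot be calibrated")
    return gain, offset


def correct(reading: float, gain: float, offset: float) -> float:
    """Invert the linear model to recover acceleration in units of g."""
    return (reading - offset) / gain
```

For a hypothetical axis that reads 1.15 at +1g and −0.85 at −1g (a 15% upward shift), the fit yields a gain of 1.0 and an offset of 0.15, and `correct(1.15, 1.0, 0.15)` recovers 1.0. Because the errors are reported to be consistent for a given board, the pair (gain, offset) would only need to be measured once per axis.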

3.2 Biosensors

The biosensor is capable of measuring acceleration, electrocardiogram (ECG), and electrical impedance pneumography (EIP) through four small electrodes connected to the side of the ribcage of the subject, as shown in Figures 5 and 6. The ECG signal is used to derive heart rate and heart rate variability (HRV). The EIP signal is produced by respiration and can be used to derive a variety of breathing-related parameters like respiration rate, minute ventilation volume, flow/volume curve, and inspiration/expiration times.

² More sophisticated motion sensors do exist in the industry, which can utilize heterogeneous sensor fusion techniques to self-calibrate the accelerometer and gyroscope. One example is the Microstrain Gyro Enhanced Orientation Sensor at: http://www.microstrain.com/.

Figure 5: Illustration of the biosensor board (top) connected to the TelosB network node (bottom). The middle layer is a Li-ion battery. The white connector on the left side is for the four skin electrodes.

Figure 6: The electrode locations are chosen to obtain a good signal-to-artefact ratio (SAR) in both EIP and ECG signals. Electrodes in both pairs are placed vertically next to each other. The front-end pair is placed vertically right below the pectoralis major and horizontally in the middle between the side of the body and mid axillary line. The back-end pair is placed on the same vertical and horizontal location to create a sensitivity field through and around the left lung.

Single-channel ECG measurement is quite straightforward. The main challenge is breathing measurement with the EIP technique, especially for volumetric parameters. So far we have tested the accuracy of the system during ergometer and running exercises [16, 17]. The accuracy of breathing minute volume assessment was degraded due to intense motion of the body during running, but results with an average relative error of 11% were still obtained. The effect of different electrode placements on movement error susceptibility has also been studied [9].

The biosensor runs real-time signal processing algorithms that detect events in heart and respiration signals and calculate physiological parameters from them. The requested parameters are sent to the base station through the SPINE framework. This reduces the amount of network traffic compared with sending raw data and enables the higher sampling rate needed for accurate HRV analysis. The ability to assess breathing-related parameters distinguishes this sensor from most similar projects. Cardiac and pulmonary measurements together provide data that can be used to derive high-level physiological parameters related to the physical and mental state of the subject.

The biosensor is connected to a TelosB and is in the same form factor as the motion sensor. The battery life of continuous measurement and raw data output is approximately 20 hours, similar to the motion sensor.

3.3 Other Compliant Sensors

The heterogeneity of DexterNet allows a wide variety of sensors and motes to be integrated into the system. The SPINE framework provides support for the Intel SHIMMER and the MICAz motes as well as any sensors that can be attached. The SHIMMER has an onboard accelerometer, MicroSD slot, and analog-to-digital converters for attaching external sensors. The MICAz has many sensors available as add-ons, including GPS, humidity, barometric pressure, ambient light, sound, and magnetometer sensors.

Our current system has a Bluetooth GPS sensor that directly interfaces with SPINE. Since the GPS unit itself is not a SPINE node, the data integration is done at a different layer than SPINE nodes such as the motion sensor and biosensor. The GPS provides longitude and latitude coordinates, primarily in outdoor environments, at a rate of 1 Hz.

4. THE SPINE FRAMEWORK

SPINE (Signal Processing In Node Environment)³ is an open-source framework for distributed signal processing algorithms in wireless sensor networks (WSNs). The functional architecture of SPINE is shown in Figure 7. It provides a set of on-node services that can be tuned and activated by the user depending on different application needs. The open-source framework speeds up the design of WSN applications through high-level abstractions and provides support to quickly explore implementation tradeoffs through fast prototyping. SPINE also provides an efficient wireless communication protocol for dynamic network configuration and management. Most importantly, SPINE allows all applications to immediately benefit from changes in the SPINE framework that may improve robustness, security, and energy efficiency.

³ The SPINE software is available for download at http://spine.tilab.com

Figure 7: The SPINE functional architecture.

The SPINE framework has two main modules, one for the sensor node side, and the other for the server/base-station side. The node module is developed in the TinyOS 2.x environment. It provides the following three on-node service components: 1. Communication. 2. Sensing. 3. Signal processing. Accordingly, the source code of the module is organized in a similar manner. In the communication component, SPINE utilizes a time division multiple access (TDMA) protocol to avoid packet collision, which allocates each node a specific time slot in which to transmit data. All sensor drivers implemented in the sensing component appear similar to the signal processing and communication components. As a result, any new sensor driver will be immediately available for all processing components. The signal processing component is similarly modular. These design features make it easy to extend the SPINE framework, and allow various team members to develop different parts of the framework simultaneously. SPINE sensing and processing functionalities are dynamically configured through over-the-air messaging. This allows each application supported by the system to reconfigure the SPINE network as desired, quickly and easily.
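The TDMA idea, one dedicated transmit slot per node in a repeating frame, can be sketched in a few lines of Python. The slot-assignment rule (sorted node IDs), the slot length, and the assumption of a shared synchronized clock are ours for illustration; the text does not describe how SPINE actually assigns slots or synchronizes nodes.

```python
from typing import Dict, Iterable


def build_schedule(node_ids: Iterable[int], slot_ms: int) -> Dict[int, int]:
    """Give each node one slot per frame; returns node id -> slot start offset (ms)."""
    ordered = sorted(node_ids)
    if len(set(ordered)) != len(ordered):
        raise ValueError("duplicate node id")
    return {node_id: index * slot_ms for index, node_id in enumerate(ordered)}


def may_transmit(node_id: int, now_ms: int, schedule: Dict[int, int],
                 slot_ms: int) -> bool:
    """True if `now_ms` (on the hypothetical shared clock) falls in the node's slot."""
    frame_ms = slot_ms * len(schedule)  # one frame = one slot per node
    phase = now_ms % frame_ms
    start = schedule[node_id]
    return start <= phase < start + slot_ms
```

With three nodes and 10 ms slots, the frame is 30 ms long and at any instant exactly one node is allowed to transmit, which is what prevents collisions.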

The server module is implemented in Java SE and acts as the coordinator of a sensor network. It consists of functionalities that activate and control on-node services depending on the application requirements. The implementation does not use any TinyOS-specific APIs and can run independently of the underlying protocol stack (e.g., the ZigBee network). This has allowed the use of a Nokia N800 tablet as a handheld base station for the wearable sensor network. The N800 provides a platform for GPS sensing through Bluetooth, and Wi-Fi connectivity to allow forwarding of data to the GNL. The use of a handheld base station allows the realization of a body sensor network that can operate both inside the home and outdoors, a key feature for supporting a wide variety of human monitoring applications.

5. APPLICATIONS AND EVALUATION

5.1 Avatar

We first demonstrate an application called Avatar, which uses a network of motion sensors on the human body to reconstruct and visualize the wearer’s full-body motion in real time. The application can be used to remotely monitor and assess the well-being of elderly people living alone. It can also be used in tele-healthcare for physicians to remotely record and visualize the movements of patients. Avatar provides much of the same information about activity that could be captured by video, but does so with a considerably higher level of privacy for the monitored person. This is quite important because it is unlikely that the average person would be willing to accept continuous video surveillance of their home. Additionally, Avatar has the benefit of being derived from wearable sensors, and so is portable. For the purpose of visualizing motion, a configuration of five nodes (one on each leg, one on each arm and one on the torso) is the minimum required, and it is the configuration used in this section. To provide finer measurement of the full-body movement, more sensor nodes can be worn by a person.

Avatar makes use of the Java Monkey Engine (jME) [1] and physics plug-in [2] to render and animate a graphical human avatar. jME allows us to create an underlying skeleton with joints, then use sensor data to continuously change the orientation of this skeleton.

Through SPINE, each node estimates the pitch and roll of its orientation in space and reports this pair of values to the base station. The orientation in space of a single sensor node is computed based on the apparent direction of gravity as seen by the sensor board’s accelerometer. When considered as a vector, the accelerometer reading is the vector sum of gravity and the acceleration due to movement of the sensor board. Under relaxed motion, the motion component of the vector is less than 10% of the magnitude of the gravity vector. As a result, this motion component is neglected and we continuously interpret the direction of the accelerometer vector as the direction of gravity. Although the sensor board’s gyroscope would presumably be beneficial in separating the gravity and motion vector components, in practice the gyroscopes indicate rotation even while the sensor is moved in a strictly translational (and non-rotational) path. As a result, the gyroscopes are currently ignored. This method is stateless and does not accumulate error, as would occur if accelerometer or gyroscope data were integrated to estimate velocity and displacement.
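The stateless orientation estimate can be sketched as follows in Python. The axis convention (z reads +1g when the board lies flat, pitch is rotation about y, roll is rotation about x) and the function name are assumptions made for illustration; the text does not specify the convention used on the node.

```python
import math
from typing import Tuple


def pitch_roll(ax: float, ay: float, az: float) -> Tuple[float, float]:
    """Estimate (pitch, roll) in degrees from one accelerometer sample.

    The sample is treated as the gravity direction, i.e. any motion
    component is neglected, so no state is kept between samples.
    """
    pitch = math.atan2(-ax, math.hypot(ay, az))
    roll = math.atan2(ay, az)
    return math.degrees(pitch), math.degrees(roll)
```

Each call uses only the current sample, so errors do not accumulate over time. As a rough bound under the stated condition, a motion component of at most 10% of gravity can tilt the measured vector away from true gravity by at most asin(0.1), or about 5.7 degrees.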

A snapshot of the output from Avatar is shown in Figure 8 with the physics skeleton in view. The yellow bars indicate the axes of the various skeleton joints, with gravity vectors shown in red. At every frame, the orientation of the skeleton is compared to the data from the sensor nodes. A simulated force is then applied to each sensed body part to push it toward the orientation reported by the sensor. The force is such that the physics skeleton tracks the motion of the wearer, but is limited by the joints of the skeleton.
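The tracking behavior can be illustrated with a one-joint Python sketch: a spring-damper drive pulls the joint angle toward the sensed angle, and the joint limits clamp the result. This is a simplified stand-in written for illustration; it does not use the jME physics API, and the stiffness, damping, and limit values are hypothetical.

```python
from typing import Tuple


def step_joint(angle: float, velocity: float, target: float, dt: float,
               stiffness: float = 40.0, damping: float = 12.0,
               lo: float = -90.0, hi: float = 90.0) -> Tuple[float, float]:
    """Advance one joint by one frame (angles in degrees, dt in seconds)."""
    # Drive toward the sensed orientation, with damping to avoid oscillation.
    accel = stiffness * (target - angle) - damping * velocity
    velocity += accel * dt
    angle += velocity * dt
    # Joint limits: the skeleton cannot follow the sensor past its range.
    if angle < lo:
        angle, velocity = lo, 0.0
    elif angle > hi:
        angle, velocity = hi, 0.0
    return angle, velocity
```

Calling `step_joint` once per rendered frame with the latest sensor-reported angle as `target` makes the simulated limb follow the wearer smoothly, while a target outside the joint range leaves the limb resting at the limit.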

Figure 8: A screen shot for the Avatar with overlaid image of the wearer.

5.2 Action Recognition

In addition to using graphical avatars to visualize and analyze human poses and movements, another application of DexterNet is human action/activity recognition. Traditionally, human action recognition has been extensively studied in computer vision using camera sensors placed in an (indoor) environment where human users reside. Compared with these high-power, high-bandwidth camera systems, body sensor networks such as DexterNet have several distinct advantages: 1. Body sensor systems do not require instrumenting the environment with cameras or other sensors. 2. Body sensor systems also have the necessary mobility to support persistent monitoring of a subject during her daily activities in both indoor and outdoor environments. 3. With the continuing miniaturization and integration of mobile processors and wireless sensors, it has become possible to manufacture body sensor systems that can densely cover the human body to record and analyze very small movements (e.g., breathing and spine movements) with higher accuracy than most extant vision systems. Such action recognition systems have been used in applications such as medical-care monitoring, athlete training, tele-immersion, and human-computer interaction. For a detailed survey of the literature, the reader is referred to [23].

We have constructed an open-source benchmark database for human action recognition using the DexterNet system called Wearable Action Recognition Database (WARD). The purpose of WARD is to offer a public and relatively stable data set for quantitative comparison of existing and future algorithms for human action recognition using body motion sensors. The database has been carefully constructed under the following conditions:

1. The database contains sufficient numbers of human subjects with a large range of age differences.

2. The designed action classes are general enough to cover most typical activities that a human subject is expected to perform in her daily life.

3. The locations of the wearable sensors are selected to be practical for full-fledged commercial systems.

4. The sampled action data contain sufficient variation, measurement noise, and outliers in order for existing and future algorithms to meaningfully examine and compare their performance.

The data are sampled from 7 female and 13 male subjects (in total 20 subjects) with age ranging from 19 to 75. For more details about the data collection, please refer to the human subject protocol included in the WARD database. The database also includes a MATLAB program to visualize the action data measured from the five motion sensors (Figure 9).⁴

Figure 9: A MATLAB program that interfaces with the TelosB base station via the serial port. The program can receive, record, and replay accelerometer and gyroscope data from a network of motion sensors.

We have proposed a distributed recognition algorithm to classify human actions using the low-bandwidth motion sensors [22, 23]. These actions include transient actions, e.g., bending, lying down, and standing up; and continuous actions, e.g., walking, running, turning, and going upstairs. The algorithm classifies human actions using a set of training motion sequences as prior examples. It is also capable of rejecting outlying actions that are not in the training categories. The classification is carried out in a distributed fashion on individual sensor nodes and a base station computer. More importantly, the algorithm is robust and adapts on the fly to changes in the set of active sensors in a body network caused by either sensor failure or network congestion. Recognition precision degrades gracefully as smaller subsets of active sensors are used. The accuracy of the framework is validated using the WARD database.

⁴ The WARD database is available for download at: http://www.eecs.berkeley.edu/~yang/software/WAR/.

5.3 Public Health

DexterNet has many applications within the field of public health, where the ability to objectively monitor the activity patterns of users may improve understanding of exposures to environmental hazards such as air pollution that are associated with asthma attacks, chronic obstructive pulmonary disease (COPD), cardiovascular disease, as well as premature mortality. The addition of the biosensor data provides a mechanism to monitor physiological responses to such exposures in real time that may be predictive of severe disease events (e.g., an asthma attack).

The inclusion of geographic location data from the GPS is also important for such applications. In the past, spatial epidemiologic studies have relied upon rather crude measures of location when describing a person’s exposure to environmental hazards such as air pollution. For example, some studies have simply used residential locations as a proxy for a person’s location. But in reality, individuals are mobile and have activity patterns that may include time away from home, at work, running chores, and exercising and playing. A system which allows for continual monitoring of an individual’s location may greatly improve the assessment of exposures for such epidemiologic studies.

To evaluate the DexterNet system for such applications, we conducted a field experiment in which the system was used to collect and process an integrated set of data related to an individual’s outdoor experience. The experiment consisted of a series of prescribed walks. A convenience sample of six adults (five male and one female) was asked to walk a 2.4 km route. The walk included sections that were uphill, downhill, and flat, as well as sections along a busy roadway, through a downtown commercial/retail area, and along a calmer path through a university campus. Over the course of the walk, various sensor data were logged, including triaxial accelerometry and biaxial gyroscopy (at the left wrist, waist, and left ankle positions), GPS location, and air pollution (airborne particulate matter ≤ 2.5 µm in size, PM2.5). The motion data were logged at 30 Hz. GPS was logged at 1 Hz. The air pollution data were logged separately using a Met One Aerocet 531, a handheld particle counter that takes 2-minute samples continuously during the walk. These data were combined and processed to ascertain specific information on the individual’s experience (e.g., assessing the magnitude of physical activity in certain geographic locations, or the air pollution in each location where heavy activity occurred).
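The combination of streams logged at different rates can be sketched as follows. This is only an illustrative sketch, not the processing pipeline used in the experiment: the function names (`activity_per_second`, `pm_at`, `merge_streams`), the record layout, the use of mean acceleration magnitude as an activity measure, and the assumption that all streams start at the same instant are hypothetical choices made for the example.

```python
from bisect import bisect_right
from statistics import mean


def activity_per_second(motion, rate_hz=30):
    """Mean acceleration magnitude for each whole second of a motion stream.

    motion: list of (ax, ay, az) tuples sampled at rate_hz.
    Trailing samples that do not fill a whole second are ignored.
    """
    per_second = []
    for start in range(0, len(motion) - rate_hz + 1, rate_hz):
        window = motion[start:start + rate_hz]
        per_second.append(
            mean((ax * ax + ay * ay + az * az) ** 0.5 for ax, ay, az in window)
        )
    return per_second


def pm_at(t, pm_starts, pm_values):
    """PM2.5 reading of the most recent sampling window starting at or before t.

    pm_starts: sorted window start times in seconds (e.g. 0, 120, 240, ...).
    Returns None when t precedes the first window.
    """
    i = bisect_right(pm_starts, t) - 1
    return pm_values[i] if i >= 0 else None


def merge_streams(gps, motion, pm_starts, pm_values, rate_hz=30):
    """Build one record per second, keyed to the 1 Hz GPS fixes.

    gps: list of (lat, lon) tuples, one per second (assumed layout).
    """
    activity = activity_per_second(motion, rate_hz)
    records = []
    for t, (lat, lon) in enumerate(gps[:len(activity)]):
        records.append({
            "t": t,
            "lat": lat,
            "lon": lon,
            "activity": activity[t],
            "pm25": pm_at(t, pm_starts, pm_values),
        })
    return records
```

Keying every record to the 1 Hz GPS fix makes it straightforward to ask location-based questions afterwards, such as which fixes coincide with the highest activity values.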

As an example, Figure 10 illustrates the GPS trace of the walking route. The application determines the changes in elevation during the walk from the GPS data. A motion sensor at the waist was used to derive energy expenditure using the Generalized Linear Model [7]. The breathing minute ventilation is derived from the biosensor EIP signal [17]. Heart rate is obtained from the biosensor ECG signal using a simple R-peak detection algorithm.
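The text does not specify the R-peak detector, so the following is a minimal sketch of one common simple approach (amplitude threshold plus refractory period), not necessarily the algorithm used in DexterNet. The function names, the threshold ratio, and the refractory interval are assumptions, and the sketch presumes a baseline-corrected ECG trace with positive-going R waves.

```python
def detect_r_peaks(ecg, fs, threshold_ratio=0.6, refractory_s=0.25):
    """Return sample indices of candidate R peaks in an ECG trace.

    A sample is accepted when it (a) reaches a fraction of the trace
    maximum, (b) is a local maximum, and (c) lies outside the refractory
    period following the previously accepted peak.
    """
    if not ecg:
        return []
    threshold = threshold_ratio * max(ecg)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(ecg) - 1):
        if ecg[i] < threshold:
            continue
        if not (ecg[i] > ecg[i - 1] and ecg[i] >= ecg[i + 1]):
            continue
        if peaks and i - peaks[-1] < refractory:
            continue
        peaks.append(i)
    return peaks


def heart_rate_bpm(peaks, fs):
    """Mean heart rate in beats per minute from R-peak sample indices."""
    if len(peaks) < 2:
        return None
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


# Synthetic trace sampled at 100 Hz with one spike every 0.8 s.
fs = 100
ecg = [0.0] * 500
for idx in range(50, 500, 80):
    ecg[idx] = 1.0
peaks = detect_r_peaks(ecg, fs)
rate = heart_rate_bpm(peaks, fs)  # approximately 75 beats per minute
```

A fixed global threshold is sensitive to motion artifacts and amplitude drift, so a field deployment would likely need an adaptive threshold or band-pass filtering ahead of this step.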

The GPS data were also used to map PM2.5 concentrations from the Aerocet monitor during the walk (Figure 11). Data from three participants illustrate less spatial variability in pollutant concentrations than interday variability. We note that one of the days (the right panel of Figure 11) corresponds to a “Spare the Air Day”, a day when an elevated air pollution warning (typically for ozone rather than PM2.5) was issued by the Bay Area Air Quality Management District to the general public. From these data it is possible to derive an individual’s average and cumulative air pollution exposures and physiologic response for use in long-term epidemiologic studies.
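As a hedged illustration of the last point, average and cumulative exposure can be computed from consecutive fixed-length samples as shown below. The function name `exposure_summary` and the example readings are hypothetical, and the sketch assumes equal-length, gap-free sampling windows (2 minutes each, matching the sampling interval described for the walk).

```python
def exposure_summary(concentrations, sample_minutes=2.0):
    """Summarize exposure from consecutive equal-length PM2.5 samples.

    concentrations: one reading per sampling window, in a common unit.
    Returns (time-weighted average, cumulative exposure), where the
    cumulative value is expressed in concentration-unit x minutes.
    """
    if not concentrations:
        raise ValueError("at least one sample is required")
    cumulative = sum(c * sample_minutes for c in concentrations)
    average = cumulative / (sample_minutes * len(concentrations))
    return average, cumulative


# Three made-up readings covering six minutes of walking.
average, cumulative = exposure_summary([10.0, 12.0, 14.0])
# average == 12.0, cumulative == 72.0
```

With equal-length windows the time-weighted average reduces to the arithmetic mean; the explicit weighting is kept so the same structure extends to variable-length windows.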

6. DISCUSSION AND FUTURE DIRECTIONS

In this paper, we have discussed DexterNet, a novel platform for heterogeneous body sensor networks. The key tenets of DexterNet are twofold: (1) It promotes an open-source sensor environment that supports limited on-node computation, robust sensor communication, and online reconfigurable network management. (2) The platform is versatile enough to support a variety of existing body sensors and other future sensors that comply with the SPINE specifications. Through a hierarchy of three network layers, it removes the dependency of higher-level applications on the implementation of wireless body sensors and communication protocols.

One advantage of the DexterNet system is its low cost compared to existing commercial systems, which are more expensive and do not necessarily support open-source development. Currently, our system is limited by the choice of off-the-shelf components (e.g., the TelosB mote and the N800), which in their current stages of development may not offer the most convenient form factor and attractive packaging to make large-scale and long-term use practical. However, this limitation can be addressed by migrating to other commercial components, at the expense of increased manufacturing cost. We are currently exploring new solutions to improve our system for future use.

There are numerous potential services that may be implemented through DexterNet, especially in the area of preventive healthcare. For example, it is possible to communicate the data to an electronic medical records or personal health records system, such that a history of a person’s activity and exposures may be used to improve diagnosis of health conditions. It is also possible through the classification algorithms described to identify conditions that are predictive of asthma attacks and warn users to reduce physical activity and/or move indoors. Such systems can create maps of microscale air pollution when they are deployed in sufficiently large numbers. Currently, only regional air pollution maps are available from the sparsely located fixed-site monitoring that regulatory agencies implement.

The hierarchical design of DexterNet also provides attractive solutions to protect the wearer’s privacy, which is mandated by the 1996 Health Insurance Portability and Accountability Act (HIPAA). Work on private-key and public-key cryptography schemes for sensor networks is applicable, but must be integrated into an appropriate authentication and authorization framework. In addition to using cryptography to protect the privacy of data, it is important to consider other security attacks, such as injection of anomalous data and illegal data exfiltration (e.g., covert channel communications [4]). Authentication, key establishment, robustness to denial-of-service attacks, secure routing, and node capture are some of the security challenges in wireless sensor networks. In the case of BSNs, these issues appear even more serious given the limited bandwidth, power supply, storage, and computational resources of the platform. When implementing privacy- and security-preserving features becomes critical to certain high-level applications, the use of a mobile computer station such as the N800 at the personal network layer reduces the burden on each wearable node.

Acknowledgments

The authors would like to thank Dr. Marco Sgroi at the WSN Lab Berkeley, Dr. Yuan Xue at Vanderbilt University, and Dr. Roozbeh Jafari at the University of Texas, Dallas, for their valuable suggestions and literature references.

7. REFERENCES

[1] Java monkey engine (http://www.jmonkeyengine.com/), September 2008.

[2] jmephysics (https://jmephysics.dev.java.net/), September 2008.

[3] P. Barralon, N. Vuillerme, and N. Noury. Walk detection with a kinematic sensor: Frequency and wavelet comparison. In Proceedings of the 28th IEEE EMBS Annual International Conference, pages 1711–1714, 2006.

Figure 10: GPS trace of campus walk with derived information from GPS, motion sensor and biosensor. Circles on the map indicate elapsed time in minutes. Panels, each plotted against elapsed time in minutes: Elevation (meters), Energy Expenditure (kJ per minute), Heart Rate (beats per minute), Breathing Minute Ventilation (liters/minute).

[4] National Computer Security Center. A guide to understanding covert channel analysis of trusted systems. In NCSC-TG-030, Covert Channel Analysis of Trusted Systems (Light Pink Book), United States Department of Defense (DoD) Rainbow Series, 1993.

[5] R. Chakravorty. A programmable service architecture for mobile medical care. In Proceedings of the IEEE International Conference on Pervasive Computing and Communications Workshop, 2006.

[6] J. Chen, K. Kwong, D. Chang, J. Luk, and R. Bajcsy. Wearable sensors for reliable fall detection. In Proceedings of the IEEE Engineering in Medicine and Biology Conference, pages 3551–3554, 2005.

[7] K. Chen and M. Sun. Improving energy expenditure estimation by using a triaxial accelerometer. Journal of Applied Physiology, 83:2112–2122, 1997.

[8] T. Degen, H. Jaeckel, M. Rufer, and S. Wyss. SPEEDY: A fall detector in a wrist watch. In Proceedings of the IEEE International Symposium on Wearable Computers, pages 184–187, 2003.

[9] O. Lahtinen, V-P. Seppa, J. Vaisanen, and J. Hyttinen. Optimal electrode configurations for impedance pneumography during sports activities. In Proceedings of the 4th European Congress for Medical and Biomedical Engineering, 2008.

[10] D. Malan, T. Fulford-Jones, M. Welsh, and S. Moulton. CodeBlue: An ad hoc sensor network infrastructure for emergency medical care. In Proceedings of the International Workshop on Wearable and Implantable Body Sensor Networks, 2004.

Figure 11: GPS traces of campus walks for 3 participants, with geocoded PM2.5 measurements (circles), suggesting less spatial variability than interday variability. Note the right panel illustrates the data for a “Spare the Air Day”, a day when an elevated air pollution warning was issued by the Bay Area Air Quality Management District to the general public.

[11] A. Milenkovic, C. Otto, and E. Jovanov. Wireless sensor networks for personal health monitoring: Issues and an implementation. (in press) Computer Communications, 2006.

[12] M. Moron, E. Casilari, R. Luque, and J. Gazquez. A wireless monitoring system for pulse-oximetry sensors. In Proceedings of the 2005 Systems Communications, 2005.

[13] N. Oliver and F. Flores-Mangas. HealthGear: A real-time wearable system for monitoring and analyzing physiological signals. In Proceedings of the International Workshop on Wearable and Implantable Body Sensor Networks, pages 61–64, 2006.

[14] I. Pappas, T. Keller, S. Mangold, M. Popovic, V. Dietz, and M. Morari. A reliable gyroscope-based gait-phase detection sensor embedded in a shoe insole. IEEE Sensors Journal, 4(2):268–274, 2004.

[15] C. Sadler and M. Martonosi. Data compression algorithms for energy-constrained devices in delay tolerant networks. In Proceedings of the ACM Conference on Embedded Networked Sensor Systems, pages 265–278, 2006.

[16] V-P. Seppa, J. Vaisanen, P. Kauppinen, J. Malmivuo, and J. Hyttinen. Measuring respirational parameters with a wearable bioimpedance device. In Proceedings of the 13th International Conference on Electrical Bioimpedance, 2007.

[17] V-P. Seppa, J. Vaisanen, O. Lahtinen, and J. Hyttinen. Assessment of breathing parameters during running with a wearable bioimpedance device. In Proceedings of the 4th European Congress for Medical and Biomedical Engineering, 2008.

[18] A. Sixsmith and N. Johnson. A smart sensor to detect the falls of the elderly. Pervasive Computing, pages 42–47, 2004.

[19] The SPINE Team. The spine manual version 1.2. Technical report, Telecom Italia Lab, 2008.

[20] G. Williams, K. Doughty, K. Cameron, and D. Bradley. A smart fall and activity monitor for telecare applications. In Proceedings of the IEEE International Conference in Medicine and Biology Society, 1998.

[21] A. Wood, G. Virone, T. Doan, Q. Cao, L. Selavo, Y. Wu, L. Fang, Z. He, S. Lin, and J. Stankovic. ALARM-NET: Wireless sensor networks for assisted-living and residential monitoring. Technical report, Department of Computer Science, University of Virginia, 2006.

[22] A. Yang, R. Jafari, P. Kuryloski, S. Iyengar, S. Sastry, and R. Bajcsy. Distributed segmentation and classification of human actions using a wearable sensor network. In Proceedings of the CVPR Workshop on Human Communicative Behavior Analysis, 2008.

[23] A. Yang, R. Jafari, S. Sastry, and R. Bajcsy. Distributed recognition of human actions using wearable motion sensor networks. Submitted to Journal of Ambient Intelligence and Smart Environments, 2008.

[24] J. Yick, B. Mukherjee, and D. Ghosal. Wireless sensor network survey. Computer Networks, 52(12):2292–2330, 2008.
