Alexandre Ferreira - Software Developer

Activity recognition using sequential sensory data model


Created Jan 15, 2022 – Last Updated Jan 15, 2022

Community

In this project, we developed a model for activity recognition using sequential sensory data. The dataset used for this project includes body motion and vital signs recordings for ten volunteers performing various physical activities. The data was collected using wearable sensors placed on the chest, right wrist, and left ankle of the subjects.


Figure 1. Activity signal acquisition from handheld smart devices.

Source: MDPI.

#Dataset

The dataset comprises recordings of 12 physical activities: standing still, sitting, lying down, walking, climbing stairs, waist bends, arm elevation, knee bends, cycling, jogging, running, and jumping. The data is stored in log files, one per subject, each containing the samples recorded for all sensors.

#Data Columns

  • Acceleration from the chest sensor (X, Y, Z axes)
  • Electrocardiogram signal (lead 1 and lead 2)
  • Acceleration from the left-ankle sensor (X, Y, Z axes)
  • Gyro from the left-ankle sensor (X, Y, Z axes)
  • Magnetometer from the left-ankle sensor (X, Y, Z axes)
  • Acceleration from the right-lower-arm sensor (X, Y, Z axes)
  • Gyro from the right-lower-arm sensor (X, Y, Z axes)
  • Magnetometer from the right-lower-arm sensor (X, Y, Z axes)
  • Label (activity identifier)
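The column order above can be mapped to names when reading a log file. The following is a minimal sketch, assuming each line holds whitespace-separated values in the order listed; the column names and the `parse_log_line` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical column names following the order of the list above.
SENSOR_AXES = ("x", "y", "z")

COLUMNS = (
    [f"chest_acc_{a}" for a in SENSOR_AXES]
    + ["ecg_lead1", "ecg_lead2"]
    + [f"ankle_{s}_{a}" for s in ("acc", "gyro", "mag") for a in SENSOR_AXES]
    + [f"arm_{s}_{a}" for s in ("acc", "gyro", "mag") for a in SENSOR_AXES]
    + ["label"]
)


def parse_log_line(line: str) -> dict:
    """Parse one whitespace-separated sample into a name -> value mapping."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    # zip stops after the 23 sensor columns; the label is handled separately.
    record = {name: float(value) for name, value in zip(COLUMNS[:-1], fields)}
    record["label"] = int(float(fields[-1]))  # activity identifier
    return record


# Synthetic sample line (not real data): 23 sensor values followed by label 4.
sample_line = "\t".join(["0.5"] * 23 + ["4"])
sample = parse_log_line(sample_line)
```

The list yields 24 columns in total: 23 sensor channels plus the label.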

#Model Pipeline

#1. Parameterization

We defined several constants for the project, including the class dictionary, sample rate, overlap block size, overlap ratio, sequence length, and the number of features for each domain.

DATA_CLASS_DICTIONARY = ["None", "Standing", "Sitting", "Lying", "Walking", "Climbing stairs", "Waist bending", "Arm elevation", "Knees bending", "Cycling", "Jogging", "Running", "Jumping"]
DATA_SAMPLE_RATE = 50
DATA_OVERLAP_BLOCKSIZE = 250
DATA_OVERLAP_RATIO = 0.5
DATA_SEQUENCE_LENGTH = 12
DATA_MODEL_NUM_FEATURES_ACC_TEMPSTAT = 27
DATA_MODEL_NUM_FEATURES_ACC_SPECT = 25
DATA_MODEL_NUM_FEATURES_ECG = 6
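To make the windowing constants more concrete, the sketch below derives the window duration and hop size from them. It assumes the sample rate is in Hz and the block size is in samples, which the constant names suggest but the source does not state explicitly.

```python
DATA_SAMPLE_RATE = 50          # assumed unit: Hz
DATA_OVERLAP_BLOCKSIZE = 250   # assumed unit: samples per window
DATA_OVERLAP_RATIO = 0.5       # fraction shared by consecutive windows

# Duration of one window in seconds.
window_seconds = DATA_OVERLAP_BLOCKSIZE / DATA_SAMPLE_RATE

# Number of samples the window advances at each step (the hop size).
hop_samples = int(DATA_OVERLAP_BLOCKSIZE * (1 - DATA_OVERLAP_RATIO))
hop_seconds = hop_samples / DATA_SAMPLE_RATE

print(f"window: {window_seconds} s, hop: {hop_samples} samples ({hop_seconds} s)")
```

Under these assumptions each window covers 5 seconds of signal and starts 2.5 seconds after the previous one.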

#2. Download, Extract, and Load Data

We downloaded and extracted the dataset from the UCI Machine Learning Repository. The data was then loaded into pandas dataframes for further processing.

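import os

# download_and_extract and read_dataset are project helpers defined elsewhere.
data_dataframes = {}  # subject id -> dataframe
if not os.path.exists('./data'):
    download_and_extract('data')
for i in range(1, 11):  # ten subjects
    data_file = f'data/mHealth_subject{i}.log'
    data_dataframes[i] = read_dataset(file_path=data_file, type='log')
    print(f'{data_file}:', data_dataframes[i].values.shape)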

#3. Preprocessing Data

We removed data entries with null or unused labels and upsampled the electrocardiogram signal to meet the required sample rate for processing.

signal_new = scipy.signal.resample(signal, len(signal) * 2)

Figure 2. Extracted and pre-processed data.
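The two preprocessing steps can be illustrated on synthetic data. This is only a sketch: it assumes label 0 (the "None" entry of the class dictionary) marks unlabeled samples, and it uses linear interpolation as a simpler stand-in for the FFT-based `scipy.signal.resample` call used in the project. The helper name `upsample_linear` is hypothetical.

```python
import numpy as np

# Synthetic stand-in for one recording: label 0 is assumed to mean "no activity".
labels = np.array([0, 0, 1, 1, 1, 0, 2, 2])
ecg = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])

# Keep only the samples that carry a real activity label.
keep = labels != 0
labels_clean = labels[keep]
ecg_clean = ecg[keep]


def upsample_linear(signal: np.ndarray, factor: int) -> np.ndarray:
    """Upsample by an integer factor with linear interpolation.

    A simple stand-in for scipy.signal.resample, which is FFT-based.
    """
    n = len(signal)
    old_positions = np.arange(n)
    new_positions = np.arange(n * factor) / factor
    # Positions beyond the last sample are clamped to the last value.
    return np.interp(new_positions, old_positions, signal)


ecg_upsampled = upsample_linear(ecg_clean, 2)
```

Every second value of `ecg_upsampled` coincides with an original sample, and the values in between are interpolated.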

#4. Extract Features Using Overlapping Sliding Window Method

We split each subject's recording into overlapping sliding windows (250 samples with a 50% overlap, as set in the constants above) and extracted a feature vector from every window.

for i in range(1, 11):
    data = data_dataframes[i]
    X_windows = window_splitter(data, DATA_OVERLAP_BLOCKSIZE, DATA_OVERLAP_RATIO)
    for X_window in X_windows:
        X_window_features = extract_features(X_window, DATA_SAMPLE_RATE)
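The project's `window_splitter` and `extract_features` helpers are not shown, so the following is a hypothetical sketch of how such helpers might work, under the assumption that incomplete trailing windows are dropped. The names `split_windows` and `simple_features` are illustrative, and the mean and standard deviation stand in for the project's much larger feature set.

```python
import numpy as np


def split_windows(data: np.ndarray, block_size: int, overlap_ratio: float):
    """Split rows of `data` into fixed-size windows that overlap by `overlap_ratio`."""
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")
    hop = max(1, int(block_size * (1 - overlap_ratio)))
    # Trailing samples that do not fill a whole window are dropped.
    return [data[start:start + block_size]
            for start in range(0, len(data) - block_size + 1, hop)]


def simple_features(window: np.ndarray) -> np.ndarray:
    """Per-channel mean and standard deviation, concatenated into one vector."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0)])


# Synthetic data: 1000 samples of 3 channels.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3))

windows = split_windows(data, block_size=250, overlap_ratio=0.5)
features = np.array([simple_features(w) for w in windows])
print(len(windows), features.shape)
```

With 1000 samples, a block size of 250, and a 50% overlap, the windows start every 125 samples, which gives 7 windows.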

#5. Split Test and Train Data

We split the data into training and testing sets with a stratified hold-out split: 75% of the samples go to training, and the class proportions are preserved in both sets.

X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, stratify=y)
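To show what stratification achieves, here is a conceptual sketch in plain Python. It is not scikit-learn's implementation; the `stratified_split` function and the synthetic labels are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_size=0.75, seed=0):
    """Return (train_idx, test_idx) with each class split at roughly `train_size`."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_size)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic, imbalanced labels: 80 windows of one class, 20 of another.
y = ["Walking"] * 80 + ["Jumping"] * 20
train_idx, test_idx = stratified_split(y)

train_counts = Counter(y[i] for i in train_idx)
test_counts = Counter(y[i] for i in test_idx)
```

Both sets keep the 4:1 ratio between the two classes, which a purely random split would not guarantee for rare activities.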

#6. Fit Model with Train Data

We trained the model using the training data and evaluated its performance on the test data.

data_model.fit(X_train, y_train)

#7. Normalize Data and Print Best Classifier Score

We normalized the test data and printed the best classifier score.

X_test = data_model.normalize(X_test)
print('Test score:', data_model.score(X_test, y_test))
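The source does not say which normalization `data_model.normalize` applies. One common choice is z-score scaling with statistics learned from the training set only, so that no information from the test set leaks into the model. The sketch below illustrates that idea under this assumption; the `StandardNormalizer` class is hypothetical and the data is synthetic.

```python
import numpy as np


class StandardNormalizer:
    """Z-score scaler: statistics come from the training set only."""

    def fit(self, X: np.ndarray) -> "StandardNormalizer":
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0)
        # Guard against division by zero for constant features.
        self.std_[self.std_ == 0] = 1.0
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        return (X - self.mean_) / self.std_


rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(200, 4))
X_test = rng.normal(loc=5.0, scale=2.0, size=(50, 4))

normalizer = StandardNormalizer().fit(X_train)
X_train_norm = normalizer.transform(X_train)
X_test_norm = normalizer.transform(X_test)  # reuses the training statistics
```

After scaling, the training features have zero mean and unit variance, while the test features are only approximately standardized because they reuse the training statistics.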

#8. Save Model as a Binary File

We saved the trained model as a binary file for future use.

data_model.save_model()
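How `save_model` serializes the model is not shown. A minimal sketch using Python's standard `pickle` module follows; the file name and the placeholder model object are assumptions.

```python
import os
import pickle
import tempfile

# Placeholder for a trained model: any picklable Python object works.
trained_model = {"classes": ["Standing", "Walking"], "weights": [0.25, 0.75]}

model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Serialize the object to a binary file.
with open(model_path, "wb") as f:
    pickle.dump(trained_model, f)

# Load it back later, for example in an inference script.
with open(model_path, "rb") as f:
    restored_model = pickle.load(f)
```

Pickle files can execute arbitrary code when loaded, so they should only be loaded from trusted sources.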

#Results

The model achieved high accuracy in recognizing the activities from the sensory data. The results were visualized using various plots, including time-domain and power spectral density (PSD) plots.
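As an illustration of the PSD view mentioned above, the sketch below estimates a one-sided power spectral density with a plain periodogram. It is an assumption that the project's plots were produced along similar lines; the `periodogram` helper and the synthetic 5 Hz signal are for demonstration only.

```python
import numpy as np


def periodogram(signal: np.ndarray, sample_rate: float):
    """One-sided power spectral density estimate of a real-valued signal."""
    n = len(signal)
    spectrum = np.fft.rfft(signal - signal.mean())  # remove the DC offset
    psd = (np.abs(spectrum) ** 2) / (sample_rate * n)
    # Double every bin except DC (and Nyquist when n is even) to fold in
    # the negative frequencies.
    if n % 2 == 0:
        psd[1:-1] *= 2
    else:
        psd[1:] *= 2
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs, psd


# Synthetic 5-second window at 50 Hz containing a 5 Hz oscillation.
sample_rate = 50
t = np.arange(250) / sample_rate
window = np.sin(2 * np.pi * 5.0 * t)

freqs, psd = periodogram(window, sample_rate)
dominant_frequency = freqs[np.argmax(psd)]
```

For this synthetic window the PSD peaks at 5 Hz. With a 250-sample window at 50 Hz, the frequency resolution is 0.2 Hz.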

#Conclusion

This project shows that sequential sensory data can be used effectively for activity recognition. The pipeline, from data preprocessing through feature extraction to model training, covers the steps needed to build an activity recognition system, and the resulting model recognized the recorded activities with high accuracy.


