Filter or Label Data Based on Sampling Rate Threshold

Description

This function applies a user-specified sampling rate threshold and either removes the data that falls below it or labels that data as "bad". The threshold can be applied at the subject level, the trial level, or both.

Usage

filter_sampling_rate(
  data,
  threshold = NA,
  action = c("remove", "label"),
  by = c("subject", "trial", "both")
)

Arguments

data

A dataframe that contains the data to be processed. The dataframe should include the columns:

‘subject’
Unique identifier for each participant in the dataset.
‘med_SR’
Subject-level median sampling rate (Hz). This represents the median sampling rate for a subject across trials.
‘SR’
Trial-level sampling rate (Hz). This represents the sampling rate for each specific trial.
threshold

Numeric value specifying the sampling rate threshold. Data falling below this threshold will either be removed or labeled as "bad".

action

Character string specifying whether to "remove" data that falls below the threshold or "label" it as bad. Acceptable values are ‘"remove"’ or ‘"label"’.

‘"remove"’
Removes rows from the dataset where the sampling rate falls below the threshold.
‘"label"’
Adds new columns ‘is_bad_subject’ and/or ‘is_bad_trial’ that flag rows where the sampling rate falls below the threshold.
by

Character string specifying whether the threshold should be applied at the "subject" level, the "trial" level, or "both". Acceptable values are ‘"subject"’, ‘"trial"’, or ‘"both"’.

‘"subject"’
Applies the threshold to the subject-level median sampling rate (‘med_SR’).
‘"trial"’
Applies the threshold to the trial-level sampling rate (‘SR’).
‘"both"’
Applies the threshold to both the subject-level (‘med_SR’) and trial-level (‘SR’) rates. Data is removed/labeled if either rate falls below the threshold.
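
As a rough illustration of how ‘action = "label"’ combines with ‘by = "both"’, the base R sketch below reproduces the comparisons described above on an invented toy dataset. It is not the function's implementation: the data values are made up, and storing the flags as logical values is an assumption, since the documentation does not say how ‘is_bad_subject’ and ‘is_bad_trial’ are encoded.

```r
# Toy data: column names follow the Arguments section; all values are invented.
# med_SR is the per-subject median of SR (s1: 28, s2: 12).
toy <- data.frame(
  subject = c("s1", "s1", "s1", "s2", "s2"),
  med_SR  = c(28, 28, 28, 12, 12),
  SR      = c(31, 28, 15, 12.5, 11.5)
)
threshold <- 20

# Approximation of action = "label", by = "both":
# flag a row when the relevant rate falls below the threshold.
# Logical flags are an assumption about the encoding.
toy$is_bad_subject <- toy$med_SR < threshold
toy$is_bad_trial   <- toy$SR < threshold

# The corresponding call, following the Usage section, would be:
# filter_sampling_rate(toy, threshold = 20, action = "label", by = "both")
```

In this toy data, subject s1 passes at the subject level but still has one trial (15 Hz) flagged at the trial level, which is the case that ‘by = "both"’ is meant to catch.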

Value

A dataframe with either rows removed or new columns (‘is_bad_subject’, ‘is_bad_trial’) added to indicate whether the data is below the threshold. Additionally, messages will inform the user how many subjects and trials were removed or labeled as "bad."

Output

The function either returns the dataset with rows removed based on the sampling rate threshold or adds new columns, ‘is_bad_subject’ and/or ‘is_bad_trial’, to the dataset. These columns indicate whether the data is considered "bad" (i.e., below the sampling rate threshold).
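
The row-removal branch can be approximated the same way. The base R sketch below, on invented data, keeps only rows whose subject-level and trial-level rates both meet the threshold, which mirrors the description of ‘action = "remove"’ with ‘by = "both"’. The message wording is hypothetical and is not the function's actual output; how the function treats missing rates is not documented and is not modeled here.

```r
# Toy data: column names follow the Arguments section; all values are invented.
toy <- data.frame(
  subject = c("s1", "s1", "s1", "s2", "s2"),
  med_SR  = c(28, 28, 28, 12, 12),
  SR      = c(31, 28, 15, 12.5, 11.5)
)
threshold <- 20

# Approximation of action = "remove", by = "both":
# a row is dropped if either rate is below the threshold.
keep <- toy$med_SR >= threshold & toy$SR >= threshold
kept <- toy[keep, , drop = FALSE]

# Hypothetical summary message; the real function's wording may differ.
n_removed <- nrow(toy) - nrow(kept)
message(n_removed, " of ", nrow(toy), " rows fall below the threshold")

# The corresponding call, following the Usage section, would be:
# filter_sampling_rate(toy, threshold = 20, action = "remove", by = "both")
```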