Filtering data in R

Filtering data in R refers to selecting a subset of the data that meets certain conditions. The `dplyr` package provides a convenient set of functions for filtering and manipulating data frames in R. Here are some examples of how to filter data using `dplyr`:

1. Filtering rows based on a condition:
You can use the `filter()` function to select rows from a data frame that meet a certain condition. For example:

library(dplyr)

df <- data.frame(name = c("Alice", "Bob", "Charlie"), age = c(25, 30, 35))
subset_df <- filter(df, age > 30)  # Select rows from df where age is greater than 30
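
One detail that is easy to miss is how `filter()` treats missing values: a row whose condition evaluates to `NA` is dropped, the same as a row where it is `FALSE`. The sketch below illustrates this with a hypothetical data frame `df_na` (not part of the original example) that adds one person with a missing age.

```r
library(dplyr)

# Hypothetical data: same columns as df, plus one row with a missing age
df_na <- data.frame(name = c("Alice", "Bob", "Charlie", "Dana"),
                    age  = c(25, 30, 35, NA))

# Rows where the condition is NA are dropped, just like rows where it is FALSE
over_30 <- filter(df_na, age > 30)

# To keep rows with a missing age as well, test for NA explicitly
over_30_or_missing <- filter(df_na, age > 30 | is.na(age))
```

Here `over_30` should contain only Charlie, while `over_30_or_missing` should contain Charlie and Dana.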


2. Filtering rows based on multiple conditions:
You can use the `&` operator to specify multiple conditions in the `filter()` function. For example:

subset_df2 <- filter(df, age > 30 & name != "Charlie")  # Select rows from df where age is greater than 30 and name is not "Charlie" (no rows match in this sample data)
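
As a complement, `filter()` also accepts several conditions separated by commas and combines them with AND, so the `&` form and the comma form select the same rows. The sketch below shows both, plus `|` for "either condition". It uses `age >= 30` rather than `age > 30`, an assumption made only so that the result is non-empty with the sample data.

```r
library(dplyr)

df <- data.frame(name = c("Alice", "Bob", "Charlie"), age = c(25, 30, 35))

# Comma-separated conditions are combined with AND, so these two calls match
with_ampersand <- filter(df, age >= 30 & name != "Charlie")
with_comma     <- filter(df, age >= 30, name != "Charlie")

# Use | when a row should match either condition
either <- filter(df, age < 30 | name == "Charlie")
```

With this data, both AND forms should return only Bob, and `either` should return Alice and Charlie.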


3. Filtering rows based on a pattern match:
You can use base R's `grepl()` function inside `filter()` to select rows where a column matches a certain pattern. For example:

df2 <- data.frame(name = c("Alice", "Bob", "Charlie"), city = c("New York", "Boston", "Chicago"))
subset_df3 <- filter(df2, grepl("New", city))  # Select rows from df2 where city contains the substring "New"


4. Selecting columns:
You can use the `select()` function to select specific columns from a data frame. For example:

subset_df4 <- select(df, name)   # Select the name column from df
subset_df5 <- select(df, name, age)   # Select the name and age columns from df
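
Row filtering and column selection are often combined in one step. The following is a minimal sketch, assuming the pipe operator `%>%` that `dplyr` makes available and a case-insensitive match via the `ignore.case` argument of `grepl()`. The name `result` is chosen here only for illustration.

```r
library(dplyr)

df2 <- data.frame(name = c("Alice", "Bob", "Charlie"),
                  city = c("New York", "Boston", "Chicago"))

# Keep rows whose city contains "new" in any letter case, then keep only name
result <- df2 %>%
  filter(grepl("new", city, ignore.case = TRUE)) %>%
  select(name)
```

With this data, `result` should be a one-column data frame containing only Alice. Filtering before selecting matters here, because `city` is no longer available once only `name` has been selected.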

These are just a few examples of how to filter and manipulate data frames in R using the `dplyr` package. Depending on the type of data and the criteria you want to use, other functions and techniques may be more appropriate. The `dplyr` documentation for `filter()` and `select()` describes further options.