Given a data frame:

```r
d1 <- c("A","B","C","A")
d2 <- c("A","V","C","F")
d3 <- c("D","B","C","A")
d4 <- c("A","B","C","A")
data.frame(d1,d2,d3,d4)
  d1 d2 d3 d4
1  A  A  D  A
2  B  V  B  B
3  C  C  C  C
4  A  F  A  A
```

Suppose also that a row may contain a unique pattern. For example, the values A, D, A occurring in the first row form a pattern assigned to class 1, and the values F, A, A in the last row form another pattern assigned to class 4.

I would like to search the data frame for rows that contain such "unique patterns" and add a new column that classifies each row, with 0 marking rows that contain none of the patterns. A pattern must occur exactly as indicated:

```
  d1 d2 d3 d4 class
1  A  A  D  A     1
2  B  V  B  B     0
3  C  C  C  C     0
4  A  F  A  A     4
```

I tried a select statement with a concat qualifier using the sqldf package, but that did not lead to a workable approach.

I would appreciate ideas on how to perform the search or if there are relevant packages to perform this type of search.

Thank you

#### Best Answer

Suppose the entries of the data.frame are single uppercase letters. Suppose also that we have a vector containing the patterns and that at most one pattern can occur in any one row.

```r
d1 <- c("A","B","C","A")
d2 <- c("A","V","C","F")
d3 <- c("B","V","E","F")
d4 <- c("A","B","C","A")
dd <- data.frame(d1,d2,d3,d4)
> dd
  d1 d2 d3 d4
1  A  A  B  A
2  B  V  V  B
3  C  C  E  C
4  A  F  F  A

pats <- c("ABA","FFA")

pat.fun <- function(r, pats) {
  # collapse the row into a single string, e.g. "AABA"
  rr <- paste(r, collapse="")
  # one TRUE/FALSE per pattern: does it occur in the row?
  tmp <- sapply(pats, function(p) grepl(p, rr))
  # index of the matching pattern, or 0 if none matches
  res <- unname(which(tmp))
  if (length(res) == 0) res <- 0
  res
}

dd$class <- apply(dd, 1, pat.fun, pats=pats)
> dd
  d1 d2 d3 d4 class
1  A  A  B  A     1
2  B  V  V  B     0
3  C  C  E  C     0
4  A  F  F  A     2
```

This is only an example; the code is certainly not very efficient.
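If efficiency matters, the row-by-row `apply` call can probably be avoided. The following is a minimal sketch, not a tested general solution: it collapses all rows at once with `paste0` and then checks each pattern against the whole vector of row strings. The helper name `classify_rows` and the named vector mapping each pattern to a class label are assumptions made for illustration, and the sketch keeps the assumption that a row contains at most one pattern (if several match, the first one listed wins).

```r
# Same data as in the example
dd <- data.frame(
  d1 = c("A", "B", "C", "A"),
  d2 = c("A", "V", "C", "F"),
  d3 = c("B", "V", "E", "F"),
  d4 = c("A", "B", "C", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: names are the patterns, values are the class labels
pats <- c(ABA = 1, FFA = 2)

# Hypothetical helper: returns one class label per row, 0 if no pattern occurs
classify_rows <- function(df, pats) {
  # collapse every row into one string, e.g. "AABA", in a single call
  row_str <- do.call(paste0, unname(as.list(df)))
  cls <- numeric(nrow(df))                  # 0 = no pattern found
  for (p in names(pats)) {
    # fixed = TRUE matches the pattern literally, not as a regular expression
    found <- grepl(p, row_str, fixed = TRUE)
    cls[found & cls == 0] <- pats[[p]]      # first matching pattern wins
  }
  cls
}

dd$class <- classify_rows(dd, pats)
```

Because the class labels come from the lookup vector, arbitrary labels (such as 1 and 4) can be used instead of the position of the pattern in the vector. Note that the pattern is matched anywhere in the collapsed row; if it must start at a particular column, the match would need to be restricted to the relevant columns first.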