I have a data frame called mydf. I want to split the contents of the ASM and GPM columns according to the format given in the FORMAT column. Each FORMAT entry is a colon-separated list of keys, and there are 5 unique keys across all rows (GT, FT, GQ, PS, GL), so the result should have 5 columns for ASM and 5 for GPM, named ASM.GT, ASM.FT, and so on. Each value then needs to go into the column that matches its key in that row's FORMAT, with NA where a key is absent from the row or the value is ".".
mydf <- structure(list(`#CHROM` = c(1L, 1L, 1L), POS = c(10490L, 10493L,
10494L), FORMAT = c("GT:FT:GQ", "GT:PS:GL", "GT:PS:FT"), ASM = c("1/1:TRUE:4,2,333",
"./.:.:.", "0/1:.:VQLOW"), GPM = c("./.:.:.", "1/1:4:2,233",
"0/1:22:VQHIGH")), .Names = c("#CHROM", "POS", "FORMAT", "ASM",
"GPM"), class = "data.frame", row.names = c(NA, -3L))
Expected result:
result <- structure(list(`#CHROM` = c(1L, 1L, 1L), POS = c(10490L, 10493L,
10494L), FORMAT = c("GT:FT:GQ", "GT:PS:GL", "GT:PS:FT"), ASM.GT = c("1/1",
"./.", "0/1"), ASM.FT = c("TRUE", NA, "VQLOW"), ASM.GQ = c("4,2,333",
NA, NA), ASM.PS = c(NA, NA, NA), ASM.GL = c(NA, NA, NA), GPM.GT = c("./.",
"1/1", "0/1"), GPM.FT = c(NA, NA, "VQHIGH"), GPM.GQ = c(NA, NA,
NA), GPM.PS = c(NA, 4L, 22L), GPM.GL = c(NA, "2,233", NA)), .Names = c("#CHROM",
"POS", "FORMAT", "ASM.GT", "ASM.FT", "ASM.GQ", "ASM.PS", "ASM.GL",
"GPM.GT", "GPM.FT", "GPM.GQ", "GPM.PS", "GPM.GL"), class = "data.frame", row.names = c(NA,
-3L))
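For reference, the mapping described above could be sketched in base R roughly as follows. This is only a minimal sketch under a few assumptions: the helper name `split_by_format` and its arguments are hypothetical, "." is treated as a missing value, and every split value is kept as a character string instead of being converted to a number.

```r
# Example input, equivalent to mydf in the question
mydf <- data.frame(
  `#CHROM` = c(1L, 1L, 1L),
  POS = c(10490L, 10493L, 10494L),
  FORMAT = c("GT:FT:GQ", "GT:PS:GL", "GT:PS:FT"),
  ASM = c("1/1:TRUE:4,2,333", "./.:.:.", "0/1:.:VQLOW"),
  GPM = c("./.:.:.", "1/1:4:2,233", "0/1:22:VQHIGH"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Hypothetical helper: expand each sample column into one column per FORMAT key.
# Assumption: `missing` (".") marks an absent value and becomes NA.
split_by_format <- function(df, sample_cols, format_col = "FORMAT", missing = ".") {
  # Keys per row, e.g. c("GT", "FT", "GQ") for the first row
  keys_by_row <- strsplit(df[[format_col]], ":", fixed = TRUE)
  # All unique keys, in order of first appearance
  all_keys <- unique(unlist(keys_by_row))

  # Start from the columns that are not being split
  out <- df[setdiff(names(df), sample_cols)]

  for (col in sample_cols) {
    vals_by_row <- strsplit(df[[col]], ":", fixed = TRUE)
    for (key in all_keys) {
      out[[paste(col, key, sep = ".")]] <- vapply(seq_len(nrow(df)), function(i) {
        # Position of this key in the row's FORMAT, NA if the key is absent
        pos <- match(key, keys_by_row[[i]])
        if (is.na(pos)) return(NA_character_)
        v <- vals_by_row[[i]][pos]
        if (is.na(v) || v == missing) NA_character_ else v
      }, character(1))
    }
  }
  out
}

res <- split_by_format(mydf, c("ASM", "GPM"))
```

Because the sketch keeps everything as character, GPM.PS holds "4" and "22" as strings, and comma-separated entries such as "2,233" are kept verbatim. If typed columns are wanted, something like `type.convert(res, as.is = TRUE)` could be applied afterwards.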