Separadores múltiples para la misma entrada de archivo R
He buscado respuestas, pero solo he encontrado cosas que se refieren a C o C #. Me doy cuenta de que gran parte de R está escrito en C, pero mi conocimiento de ello es inexistente. También soy relativamente nuevo en R. Estoy usando el actual Rstudio.
Esto es similar a lo que quiero, creo.Lea los datos de manera eficiente con múltiples líneas de separación en R
Tengo un archivo csv pero una variable es una cadena con valores separados por_
y-
Y me gustaría saber si hay un paquete o código adicional que haga lo siguiente en la lectura. mando.
"1","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_ANDROID","2013-08-31 13:39:55.0","2013-10-16 13:58:00.0",0,218,4,93,1377907200000
"2","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_ANDROID","2013-08-31 13:39:55.0","2013-10-16 13:58:00.0",0,390,5,157,1377993600000
"3","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_ANDROID","2013-08-31 13:39:55.0","2013-10-16 13:58:00.0",0,376,5,193,1.37808e+12
"4","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_ANDROID","2013-08-31 13:39:55.0","2013-10-16 13:58:00.0",1,35,1,15,1377907200000
"5","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_ANDROID","2013-08-31 13:39:55.0","2013-10-16 13:58:00.0",12,11258,117,2843,1377993600000
"6","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_ANDROID","2013-08-31 13:39:55.0","2013-10-16 13:58:00.0",5,4659,56,1826,1.37808e+12
"7","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_ANDROID","2013-08-31 13:39:55.0","2013-10-16 13:58:00.0",7,7296,136,2684,1377907200000
"8","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_IOS_IPAD","2013-08-31 13:18:21.0","2013-10-16 13:58:00.0",0,4533,35,1632,1377907200000
"9","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_IOS_IPAD","2013-08-31 13:18:21.0","2013-10-16 13:58:00.0",0,421,6,161,1377993600000
"10","Client1","Name2","*Name3_Name1_KB_MobApp_M-13-44_AU_PI Likes by KB_IOS_IPAD","2013-08-31 13:18:21.0","2013-10-16 13:58:00.0",0,57,2,23,1.37808e+12
Fila de ejemplo:
Name Name1 *XYZ_Name3_KB_MobApp_M-18-25_AU_PI ANDROID 2013-09-32 14:39:55.0 2013-10-16 13:58:00.0 0 218 4 93 1377907200000
Así que es fácil de leer
results <- read.delim("~/results", header=F)
pero luego todavía tengo la cadena * XYZ_Name3_KB_MobApp_M-18-25_AU_PI
Salida deseada (separada por_
y por-
):
Name Name1 *XYZ Name3 KB MobApp M 18 25 AU PI ANDROID 2013-09-32 14:39:55.0 2013-10-16 13:58:00.0 0 218 4 93 1377907200000
pero no dividir la cadena de tiempo.
---- Gracias a @Henrik y @AnandaMahto por el código y el paquete. ----
library(splitstackshape)
# split concatenated column by `_`
df4 <- concat.split(data = df3, split.col = "V3", sep = "_", drop = TRUE)
# split the remaining concatenated part by `-`
df5 <- concat.split(data = df4, split.col = "V3_5", sep = "-", drop = TRUE)