Adding new text to the Sklearn TfidfVectorizer (Python)
Is there a function to add to the existing corpus? I have already generated my matrix, and I am looking to add to the table periodically without reprocessing the whole shebang.
For example:
from sklearn.feature_extraction.text import TfidfVectorizer

# prep_text and tokenize_text are my own helper functions, defined elsewhere
articleList = ['here is some text blah blah', 'another text object', 'more foo for your bar right now']
tfidf_vectorizer = TfidfVectorizer(
    max_df=.8,
    max_features=2000,
    min_df=.05,
    preprocessor=prep_text,
    use_idf=True,
    tokenizer=tokenize_text
)
tfidf_matrix = tfidf_vectorizer.fit_transform(articleList)
# Adding a new article to the existing set?
bigger_tfidf_matrix = tfidf_vectorizer.fit_transform(['the last article I wanted to add'])
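
For context, calling fit_transform a second time refits the vectorizer on the new article alone, so the resulting matrix no longer shares columns with the original one. One possible workaround, offered as a minimal sketch rather than a definitive answer, is to call transform on the already fitted vectorizer and stack the new row under the existing matrix with scipy.sparse.vstack. The sketch assumes the default preprocessor and tokenizer (the custom prep_text and tokenize_text helpers are left out because they are not shown), and the name article_list is used only for illustration. The trade-off is that the vocabulary and IDF weights stay frozen at their fitted values: words not seen during fitting are ignored, and an exact update of the weights would still require refitting on the full corpus.

```python
from scipy.sparse import vstack
from sklearn.feature_extraction.text import TfidfVectorizer

article_list = [
    'here is some text blah blah',
    'another text object',
    'more foo for your bar right now',
]

# Fit once on the existing corpus: this learns the vocabulary and IDF weights.
# Assumption: default preprocessing and tokenization instead of custom helpers.
tfidf_vectorizer = TfidfVectorizer(use_idf=True)
tfidf_matrix = tfidf_vectorizer.fit_transform(article_list)

# transform() (not fit_transform()) reuses the fitted vocabulary and IDF weights,
# so the new row has the same columns as the existing matrix.
new_rows = tfidf_vectorizer.transform(['the last article I wanted to add'])

# Stack the new row under the existing rows without reprocessing the old articles.
bigger_tfidf_matrix = vstack([tfidf_matrix, new_rows]).tocsr()

print(tfidf_matrix.shape, bigger_tfidf_matrix.shape)
```

In this toy example none of the words in the new article occur in the fitted vocabulary, so the appended row is all zeros. That illustrates the limitation: if the new articles introduce a lot of unseen vocabulary, periodic refitting on the full corpus is likely still needed.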