pyodbc / SQLAlchemy: enabling fast executemany

Question

Aug 23, 2018, 11:18 AM


In response to my question "How to speed up data wrangling A LOT in Python + Pandas + sqlAlchemy + MSSQL/T-SQL", I was kindly directed to "Speeding up pandas.DataFrame.to_sql with fast_executemany of pyODBC" by @IljaEverilä.

NB: For testing purposes I am only reading/writing 10k rows.

I added the event listener and a) the function is called, but b) executemany is clearly not set, because the IF fails and cursor.fast_executemany is never set.

import sqlalchemy as sqla
from sqlalchemy import event

def namedDbSqlAEngineCreate(dbName):
    # Create an engine and switch to the named db.
    # Returns the engine if successful and None if not.
    # 2018-08-23 added fast_executemany according to this https://stackoverflow.com/questions/48006551/speeding-up-pandas-dataframe-to-sql-with-fast-executemany-of-pyodbc?rq=1
    # defaultDSN is assumed to be a DSN name defined elsewhere in the module.
    engineStr = 'mssql+pyodbc://@' + defaultDSN
    engine = sqla.create_engine(engineStr, echo=False)

    @event.listens_for(engine, 'before_cursor_execute')
    def receive_before_cursor_execute(conn, cursor, statement, params, context, executemany):
        # print("FUNC call")
        # Enable pyodbc's fast path only when SQLAlchemy flags a batched call.
        if executemany:
            print('executemany')
            cursor.fast_executemany = True

    try:
        engine.execute('USE ' + dbName)
        return engine
    except sqla.exc.SQLAlchemyError as ex:
        # ex.orig is the underlying pyodbc error; args[0] holds its SQLSTATE.
        if ex.orig.args[0] == '08004':
            print('namedDbSqlAEngineCreate:Database %s does not exist' % dbName)
        else:
            print(ex.args[0])
        return None

Naturally, there is no change in speed.
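To see which statements actually reach the hook with executemany set, the listener could record the flag for every call. The following is only a minimal sketch: the `calls` list and the `SimpleNamespace` stand-in cursor are hypothetical names introduced here so the hook can be exercised without a database, and the two calls imitate what I assume SQLAlchemy passes for a single statement versus a batched INSERT.

```python
from types import SimpleNamespace

# (first keyword of the statement, executemany flag) for each hook call
calls = []

def receive_before_cursor_execute(conn, cursor, statement, params, context, executemany):
    # Record every call so it is visible which statements are batched.
    calls.append((statement.split()[0].upper(), executemany))
    if executemany:
        cursor.fast_executemany = True

# Stand-in cursor (hypothetical); a real one would be a pyodbc cursor.
fake_cursor = SimpleNamespace(fast_executemany=False)

# A single statement such as 'USE ...' arrives with executemany=False.
receive_before_cursor_execute(None, fake_cursor, "USE someDb", (), None, False)
flag_after_single = fake_cursor.fast_executemany

# A batched INSERT with a list of parameter tuples arrives with executemany=True.
receive_before_cursor_execute(
    None, fake_cursor, "INSERT INTO t (a) VALUES (?)", [(1,), (2,)], None, True
)
flag_after_batch = fake_cursor.fast_executemany
```

If the recorded list never contains an INSERT with the flag set to True while to_sql runs, the rows are presumably not being sent through executemany at all, and the hook cannot help.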

The to_sql call from my original question is unchanged:

nasToFillDF.to_sql(name=tempTableName, con=engine.engine, if_exists='replace', chunksize=100, index=False)

I kept chunksize at 100 because when I tried, as in the example, setting chunksize=None, I got this error message (which I had run into before):

(pyodbc.ProgrammingError) ('The SQL contains -31072 parameter markers, but 100000 parameters were supplied', 'HY000')
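The numbers in that message are consistent with a single statement carrying all 100000 parameters (10k rows, so presumably 10 columns) and the marker count wrapping around a signed 16-bit integer. That is my reading of the arithmetic, not something I have confirmed in the driver. The sketch below reproduces the wraparound and estimates how many rows would fit in one statement under SQL Server's limit of 2100 parameters; the helper names are hypothetical.

```python
def as_signed_16bit(n: int) -> int:
    """Interpret n modulo 2**16 as a signed 16-bit integer."""
    return (n + 2**15) % 2**16 - 2**15

def max_rows_per_statement(n_columns: int, max_params: int = 2100) -> int:
    """Rows that fit in one multi-row INSERT while staying below max_params."""
    # Stay strictly below the limit as a safety margin (an assumption).
    return (max_params - 1) // n_columns

rows, params = 10_000, 100_000
n_columns = params // rows            # inferred from the message, not stated
wrapped = as_signed_16bit(params)     # the negative count from the error
safe_rows = max_rows_per_statement(n_columns)
```

Under these assumptions, chunksize=100 stays within the limit for a 10-column frame, which would explain why that setting runs while chunksize=None does not.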

What have I done wrong? I guess the executemany parameter of receive_before_cursor_execute is not being set, but if that is the answer, I have no idea how to fix it.

Setup is pyodbc 4.0.23, SQLAlchemy 1.2.6, Python 3.6.something