Airflow 1.10 installation error
I have a working Airflow environment running Airflow 1.9 on an Amazon EC2 instance. I need to upgrade to the latest version of Airflow, which is 1.10. I can either upgrade in place from 1.9 or do a fresh install of 1.10 on a new server. Airflow 1.10 is not listed on pip, so I am installing it from Git with this command:
pip-3.6 install git+git://github.com/apache/incubator-airflow.git@v1-10-stable
This command installs Airflow 1.10 successfully. You can confirm this by running
airflow version
and looking at the output:
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
v1.10.0
When I tried to start the Airflow scheduler with
airflow scheduler
I got the following exception:
ModuleNotFoundError: No module named 'MySQLdb'
[2018-08-14 14:03:16,195] {celery_executor.py:112} ERROR - Error syncing the celery executor, ignoring it:
[2018-08-14 14:03:16,195] {celery_executor.py:113} ERROR - No module named 'MySQLdb'
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 94, in sync
state = task.state
File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 471, in state
return self._get_task_meta()['status']
File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 410, in _get_task_meta
return self._maybe_set_cache(self.backend.get_task_meta(self.id))
File "/usr/local/lib/python3.6/site-packages/celery/backends/base.py", line 365, in get_task_meta
meta = self._get_task_meta_for(task_id)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 53, in _inner
return fun(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 122, in _get_task_meta_for
session = self.ResultSession()
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
**self.engine_options)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 59, in session_factory
engine, session = self.create_session(dburi, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 45, in create_session
engine = self.get_engine(dburi, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 42, in get_engine
return create_engine(dburi, poolclass=NullPool)
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 391, in create_engine
return strategy.create(*args, **kwargs)
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
dbapi = dialect_cls.dbapi(**dbapi_args)
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
return __import__('MySQLdb')
ModuleNotFoundError: No module named 'MySQLdb'
[2018-08-14 14:03:16,196] {celery_executor.py:112} ERROR - Error syncing the celery executor, ignoring it:
[2018-08-14 14:03:16,196] {celery_executor.py:113} ERROR - No module named 'MySQLdb'
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 94, in sync
state = task.state
File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 471, in state
return self._get_task_meta()['status']
File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 410, in _get_task_meta
return self._maybe_set_cache(self.backend.get_task_meta(self.id))
File "/usr/local/lib/python3.6/site-packages/celery/backends/base.py", line 365, in get_task_meta
meta = self._get_task_meta_for(task_id)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 53, in _inner
return fun(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 122, in _get_task_meta_for
session = self.ResultSession()
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
**self.engine_options)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 59, in session_factory
engine, session = self.create_session(dburi, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 45, in create_session
engine = self.get_engine(dburi, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 42, in get_engine
return create_engine(dburi, poolclass=NullPool)
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 391, in create_engine
return strategy.create(*args, **kwargs)
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
dbapi = dialect_cls.dbapi(**dbapi_args)
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
return __import__('MySQLdb')
ModuleNotFoundError: No module named 'MySQLdb'
[2018-08-14 14:03:16,197] {celery_executor.py:112} ERROR - Error syncing the celery executor, ignoring it:
[2018-08-14 14:03:16,197] {celery_executor.py:113} ERROR - No module named 'MySQLdb'
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 94, in sync
state = task.state
File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 471, in state
return self._get_task_meta()['status']
File "/usr/local/lib/python3.6/site-packages/celery/result.py", line 410, in _get_task_meta
return self._maybe_set_cache(self.backend.get_task_meta(self.id))
File "/usr/local/lib/python3.6/site-packages/celery/backends/base.py", line 365, in get_task_meta
meta = self._get_task_meta_for(task_id)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 53, in _inner
return fun(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 122, in _get_task_meta_for
session = self.ResultSession()
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
**self.engine_options)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 59, in session_factory
engine, session = self.create_session(dburi, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 45, in create_session
engine = self.get_engine(dburi, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/backends/database/session.py", line 42, in get_engine
return create_engine(dburi, poolclass=NullPool)
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 391, in create_engine
return strategy.create(*args^C[2018-08-14 14:03:16,424] {jobs.py:1585} INFO - Exited execute loop
[2018-08-14 14:03:16,433] {jobs.py:1599} INFO - Terminating child PID: 13615
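To confirm what the traceback reports, one could check from the same Python 3.6 interpreter that runs Airflow whether the modules involved can be found at all. The following is only a minimal sketch: the helper name `is_importable` and the list of module names are my own choices for illustration, not anything Airflow provides.

```python
import importlib.util


def is_importable(module_name):
    """Return True if this interpreter can locate the named top-level module.

    Hypothetical helper for illustration; it does not import the module,
    it only asks the import system whether a spec can be found.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for some malformed or partially initialised modules.
        return False


if __name__ == "__main__":
    # Module names taken from the traceback above.
    for name in ("MySQLdb", "sqlalchemy", "celery"):
        status = "found" if is_importable(name) else "MISSING"
        print(name, status)
```

If `MySQLdb` is reported as missing here too, the problem would be an absent MySQL driver in this interpreter rather than something specific to the scheduler.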
This is what my lib folder contains:
[/usr/local/lib/python3.6/site-packages]# cd /usr/local/lib64/python3.6/site-packages/sqlalchemy/
root@ip-1-2-3-4
[/usr/local/lib64/python3.6/site-packages/sqlalchemy]# ll
total 320
drwxr-xr-x 3 root root 4096 Aug 13 17:17 connectors
-rwxr-xr-x 1 root root 40456 Aug 13 17:17 cprocessors.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 1 root root 51408 Aug 13 17:17 cresultproxy.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 1 root root 21944 Aug 13 17:17 cutils.cpython-36m-x86_64-linux-gnu.so
drwxr-xr-x 3 root root 4096 Aug 13 17:17 databases
drwxr-xr-x 10 root root 4096 Aug 13 17:17 dialects
drwxr-xr-x 3 root root 4096 Aug 13 17:17 engine
drwxr-xr-x 3 root root 4096 Aug 13 17:17 event
-rwxr-xr-x 1 root root 49746 Mar 6 14:01 events.py
-rwxr-xr-x 1 root root 12030 Mar 6 14:01 exc.py
drwxr-xr-x 4 root root 4096 Aug 13 17:17 ext
-rwxr-xr-x 1 root root 2249 Mar 6 14:01 __init__.py
-rwxr-xr-x 1 root root 3093 Mar 6 14:01 inspection.py
-rwxr-xr-x 1 root root 10967 Mar 6 14:01 interfaces.py
-rwxr-xr-x 1 root root 6712 Mar 6 14:01 log.py
drwxr-xr-x 3 root root 4096 Aug 13 17:17 orm
-rwxr-xr-x 1 root root 49883 Mar 6 14:01 pool.py
-rwxr-xr-x 1 root root 5217 Mar 6 14:01 processors.py
drwxr-xr-x 2 root root 4096 Aug 13 17:17 __pycache__
-rwxr-xr-x 1 root root 1200 Mar 6 14:01 schema.py
drwxr-xr-x 3 root root 4096 Aug 13 17:17 sql
drwxr-xr-x 5 root root 4096 Aug 13 17:17 testing
-rwxr-xr-x 1 root root 1713 Mar 6 14:01 types.py
drwxr-xr-x 3 root root 4096 Aug 13 17:17 util
root@ip-1-2-3-4
[/usr/local/lib64/python3.6/site-packages/sqlalchemy]# pwd
/usr/local/lib64/python3.6/site-packages/sqlalchemy
root@ip-1-2-3-4
[/usr/local/lib64/python3.6/site-packages/sqlalchemy]# cd /usr/local/lib/python3.6/site-packages/sqlalchemy/
bash: cd: /usr/local/lib/python3.6/site-packages/sqlalchemy/: No such file or directory
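As far as I understand, on some 64-bit Linux distributions Python keeps pure-Python packages and packages with compiled extensions in separate directories, which might explain why sqlalchemy sits under lib64 and not under lib. A small sketch to see which directories this interpreter uses and where a given package actually lives; the function names `install_dirs` and `locate_package` are hypothetical and only for illustration.

```python
import importlib.util
import sysconfig


def install_dirs():
    """Return the pure-Python and platform-specific install directories."""
    paths = sysconfig.get_paths()
    return {"purelib": paths["purelib"], "platlib": paths["platlib"]}


def locate_package(name):
    """Return the file a top-level package would be loaded from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    for kind, path in install_dirs().items():
        print(kind, "->", path)
    # Package names taken from the output above.
    for pkg in ("sqlalchemy", "MySQLdb"):
        print(pkg, "->", locate_package(pkg))
```

If the two directories differ, a package appearing in only one of them would be expected and not by itself a sign of a broken install.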
I am confused about why the Airflow installation did not take care of all of its required dependencies. Am I installing Airflow incorrectly? I really need to be on version 1.10 because version 1.9 has a major bug, as discovered here and here.