Using psycopg2 with Lambda to update Redshift (Python)

Question

Apr 13, 2016, 09:35 PM

amazon-redshift amazon-web-services aws-lambda python aws-sdk

I am trying to update Redshift from a Lambda function using Python. To do this, I am attempting to combine two code fragments. Both fragments work when I run them separately.

Updating Redshift from PyDev for Eclipse

import psycopg2

conn_string = "dbname='name' port='0000' user='name' password='pwd' host='url'"
conn = psycopg2.connect(conn_string)

cursor = conn.cursor()

cursor.execute("UPDATE table SET attribute='new'")
conn.commit()
cursor.close()

Receiving content uploaded to an S3 bucket (pre-built template available in Lambda)

from __future__ import print_function

import json
import urllib
import boto3

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    #print("Received event: " + json.dumps(event, indent=2))

    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key']).decode('utf8')

    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        print("CONTENT TYPE: " + response['ContentType'])
        return response['ContentType']

    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
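
For reference, the bucket and key lookup from the template above can be exercised locally without S3. This is a minimal sketch, assuming a hand-built event dict with the same shape as the one Lambda passes in; the helper name `extract_bucket_and_key` and the sample bucket and key are hypothetical. The import fallback is there because `urllib.unquote_plus` only exists under that name in Python 2, while Python 3 moved it to `urllib.parse`.

```python
try:
    from urllib.parse import unquote_plus  # Python 3 location
except ImportError:
    from urllib import unquote_plus  # Python 2 location


def extract_bucket_and_key(event):
    """Return (bucket, key) taken from the first record of an S3 event."""
    s3_info = event['Records'][0]['s3']
    bucket = s3_info['bucket']['name']
    # Object keys arrive URL-encoded in the event, so decode them first
    key = unquote_plus(s3_info['object']['key'])
    return bucket, key


# Hypothetical event containing only the fields the handler reads
sample_event = {
    'Records': [
        {
            's3': {
                'bucket': {'name': 'example-bucket'},
                'object': {'key': 'my+folder/report%282016%29.csv'},
            }
        }
    ]
}

print(extract_bucket_and_key(sample_event))
```

Running this under both interpreters is a quick way to see whether the handler code itself depends on the Python version.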

Since both segments worked, I tried to combine them so that I can update Redshift whenever a file is uploaded to S3:

from __future__ import print_function

import json
import urllib
import boto3
import psycopg2

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    #print("Received event: " + json.dumps(event, indent=2))

    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key']).decode('utf8')

    conn_string = "dbname='name' port='0000' user='name' password='pwd' host='url'"

    conn = psycopg2.connect(conn_string)

    cursor = conn.cursor()

    cursor.execute("UPDATE table SET attribute='new'")
    conn.commit()
    cursor.close()

    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        # Read the body once; a second read() on the same stream returns an empty string
        body = response['Body'].read()
        print("CONTENT: " + body)
        return body
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e

Since I am using an external library, I need to create a deployment package. I created a new folder (lambda_function1) and moved my .py file (lambda_function1.py) into it. I ran the following command to install psycopg2 into that folder:

pip install psycopg2 -t \lambda_function1

I get the following output:

Collecting psycopg2
  Using cached psycopg2-2.6.1-cp34-none-win_amd64.whl
Installing collected packages: psycopg2
Successfully installed psycopg2-2.6.1

I then zipped the contents of the directory and uploaded that zip to my Lambda function. When I upload a document to the bucket that the function monitors, I see the following error in my CloudWatch log:

Unable to import module 'lambda_function1': No module named _psycopg
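
The zipping step described above can also be scripted, which rules out differences between archive tools. This is a minimal sketch, assuming the goal is an archive whose entries sit at the root of the zip (the folder's contents rather than the folder itself); the function name `zip_directory_contents` is hypothetical.

```python
import os
import zipfile


def zip_directory_contents(source_dir, zip_path):
    """Zip everything inside source_dir, with paths relative to source_dir."""
    source_dir = os.path.abspath(source_dir)
    zip_path = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                if full_path == zip_path:
                    continue  # never add the archive to itself
                # Store the entry relative to source_dir so it lands at the zip root
                arcname = os.path.relpath(full_path, source_dir)
                archive.write(full_path, arcname)
    return zip_path


# Example call for the folder from this question:
# zip_directory_contents('lambda_function1', 'lambda_function1.zip')
```

Listing the result with `zipfile.ZipFile(...).namelist()` should then show `lambda_function1.py` and the `psycopg2` package at the top level.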

When I look in the library, the only thing named "_psycopg" is "_psycopg.pyd".
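
A quick way to check which compiled extension modules ended up in the package folder: `.pyd` is the Windows format for compiled Python extensions, whereas Linux builds ship `.so` files, and Lambda functions run on Linux. The sketch below is only a diagnostic under that assumption; the function name `find_compiled_extensions` is hypothetical.

```python
import os

# Suffixes of compiled extension modules per platform
WINDOWS_SUFFIX = '.pyd'
LINUX_SUFFIX = '.so'


def find_compiled_extensions(package_dir):
    """Return relative paths of compiled extensions, grouped by platform."""
    found = {'windows': [], 'linux': []}
    for root, _dirs, files in os.walk(package_dir):
        for name in sorted(files):
            rel_path = os.path.relpath(os.path.join(root, name), package_dir)
            if name.endswith(WINDOWS_SUFFIX):
                found['windows'].append(rel_path)
            elif name.endswith(LINUX_SUFFIX):
                found['linux'].append(rel_path)
    return found


# Example call for the folder from this question:
# print(find_compiled_extensions('lambda_function1'))
```

If the result lists entries only under `'windows'`, that would be consistent with the import error above, since a Linux runtime would find no `_psycopg` module it can load.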

What is causing this problem? Does it matter that Lambda uses Python 2.7 while I use 3.4? Does it matter that I zipped the contents of my folder on a Windows machine? Has anyone been able to connect to Redshift from Lambda successfully?