Wie kann man den Typ der verschachtelten Datenstrukturen in Python bestimmen?

Question

Jan 06, 2016, 05:55 PM

Wie kann man den Typ der verschachtelten Datenstrukturen in Python bestimmen?

Ich übersetze gerade Python nach F #, speziell Neuronale Netze-und-Deep-Learning.

Um sicherzustellen, dass die Datenstrukturen korrekt übersetzt wurden, werden die Details der in Python verschachtelten Typen benötigt. DasArt( -Funktion funktioniert für einfache Typen, aber nicht für verschachtelte Typen.

Zum Beispiel in Python:

> data = ([[1,2,3],[4,5,6],[7,8,9]],["a","b","c"])
> type(data)
<type 'tuple'>

only gibt nur den Typ der ersten Ebene an. Über die Arrays im Tupel ist nichts bekannt.

Ich hatte gehofft auf so etwas wie das, was F # macht

> let data = ([|[|1;2;3|];[|4;5;6|];[|7;8;9|]|],[|"a";"b";"c"|]);;

val data : int [] [] * string [] =
  ([|[|1; 2; 3|]; [|4; 5; 6|]; [|7; 8; 9|]|], [|"a"; "b"; "c"|])

Unterschrift unabhängig vom Wert zurückgeben

int [] [] * string []

*         is a tuple item separator  
int [] [] is a two dimensional jagged array of int  
string [] is a one dimensional array of string

Kann oder wie geht das in Python?

TLDR;

Zurzeit verwende ich PyCharm mit dem Debugger und klicke im Variablenfenster auf die Ansichtsoption für eine einzelne Variable, um die Details anzuzeigen. Das Problem ist, dass die Ausgabe die Werte zusammen mit den Typen vermischt enthält und ich nur die Typensignatur benötige. Wenn die Variablen wie (float [50000] [784], int [50000]) sind, stören die Werte. Ja, ich ändere die Größe der Variablen im Moment, aber das ist eine Problemumgehung und keine Lösung.

z.B

UsingPyCharm Community

(array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],
        ...,     
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],
        [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32),
  array([7, 2, 1, ..., 4, 5, 6]))

Using Spyder

UsingVisual Studio Community mitPython Tools für Visual Studio

(array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],    
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],  
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],  
        ...,   
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],  
        [ 0.,  0.,  0., ...,  0.,  0.,  0.],  
        [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32),  
  array([5, 0, 4, ..., 8, 4, 8], dtype=int64))

BEARBEITEN

Da diese Frage angestarrt wurde, sucht anscheinend jemand nach mehr Details, hier ist meine modifizierte Version, die auch mit @ umgehen kanumpy ndarray. Dank an Vlad für die erste Version.

Auch wegen der Verwendung einer Variation vonRun Length Encoding gibt es keine Verwendung mehr von? für heterogene Typen.

# Note: Typing for elements of iterable types such as Set, List, or Dict 
# use a variation of Run Length Encoding.

def type_spec_iterable(iterable, name):
    def iterable_info(iterable):
        # With an iterable for it to be comparable 
        # the identity must contain the name and length 
        # and for the elements the type, order and count.
        length = 0
        types_list = []
        pervious_identity_type = None
        pervious_identity_type_count = 0
        first_item_done = False
        for e in iterable:
            item_type = type_spec(e)
            if (item_type != pervious_identity_type):
                if not first_item_done:
                    first_item_done = True
                else:
                    types_list.append((pervious_identity_type, pervious_identity_type_count))
                pervious_identity_type = item_type
                pervious_identity_type_count = 1
            else:
                pervious_identity_type_count += 1
            length += 1
        types_list.append((pervious_identity_type, pervious_identity_type_count))
        return (length, types_list)
    (length, identity_list) = iterable_info(iterable)
    element_types = ""
    for (identity_item_type, identity_item_count) in identity_list:
        if element_types == "":
            pass
        else:
            element_types += ","
        element_types += identity_item_type
        if (identity_item_count != length) and (identity_item_count != 1):
            element_types += "[" + `identity_item_count` + "]"
    result = name + "[" + `length` + "]<" + element_types + ">"
    return result

def type_spec_dict(dict, name):
    def dict_info(dict):
        # With a dict for it to be comparable 
        # the identity must contain the name and length 
        # and for the key and value combinations the type, order and count.
        length = 0
        types_list = []
        pervious_identity_type = None
        pervious_identity_type_count = 0
        first_item_done = False
        for (k, v) in dict.iteritems():
            key_type = type_spec(k)
            value_type = type_spec(v)
            item_type = (key_type, value_type)
            if (item_type != pervious_identity_type):
                if not first_item_done:
                    first_item_done = True
                else:
                    types_list.append((pervious_identity_type, pervious_identity_type_count))
                pervious_identity_type = item_type
                pervious_identity_type_count = 1
            else:
                pervious_identity_type_count += 1
            length += 1
        types_list.append((pervious_identity_type, pervious_identity_type_count))
        return (length, types_list)
    (length, identity_list) = dict_info(dict)
    element_types = ""
    for ((identity_key_type,identity_value_type), identity_item_count) in identity_list:
        if element_types == "":
            pass
        else:
            element_types += ","
        identity_item_type = "(" + identity_key_type + "," + identity_value_type + ")"
        element_types += identity_item_type
        if (identity_item_count != length) and (identity_item_count != 1):
            element_types += "[" + `identity_item_count` + "]"
    result = name + "[" + `length` + "]<" + element_types + ">"
    return result

def type_spec_tuple(tuple, name):
    return name + "<" + ", ".join(type_spec(e) for e in tuple) + ">"

def type_spec(obj):
    object_type = type(obj)
    name = object_type.__name__
    if (object_type is int) or (object_type is long) or (object_type is str) or (object_type is bool) or (object_type is float):            
        result = name
    elif object_type is type(None):
        result = "(none)"
    elif (object_type is list) or (object_type is set):
        result = type_spec_iterable(obj, name)
    elif (object_type is dict):
        result = type_spec_dict(obj, name)
    elif (object_type is tuple):
        result = type_spec_tuple(obj, name)
    else:
        if name == 'ndarray':
            ndarray = obj
            ndarray_shape = "[" + `ndarray.shape`.replace("L","").replace(" ","").replace("(","").replace(")","") + "]"
            ndarray_data_type = `ndarray.dtype`.split("'")[1]
            result = name + ndarray_shape + "<" + ndarray_data_type + ">"
        else:
            result = "Unknown type: " , name
    return result

Ich würde es nicht für erledigt halten, aber es hat an allem funktioniert, was ich bisher brauchte.