INDEX
    Explanations

    item type or attribute identification

    Negative Logits
    " Елена"           0.48
    " ataque"          0.46
    " кризи"           0.46
    "지와"             0.44
    " ambivalent"      0.44
    "要注意"           0.43
    "aparikkh"         0.43
    " divergents"      0.43
    " swearing"        0.43
    " susceptibles"    0.43
    Positive Logits
    " type"            0.56
    "类型"             0.54
    "type"             0.54
    " number"          0.52
    " Yes"             0.50
    " Technology"      0.49
    "Type"             0.49
    " types"           0.49
    "size"             0.49
    "類型"             0.49
    Act Density 0.034%

    No Known Activations