INDEX
    Explanations

    attraction, desirable, strengths

    Negative Logits
    0.46
    গুরু
    0.44
     temor
    0.43
     disminución
    0.43
     recuer
    0.42
     forskjellige
    0.41
    цької
    0.40
     autre
    0.40
     zdroj
    0.39
     decenas
    0.39
    Positive Logits
    ana
    0.48
    biology
    0.48
    arian
    0.46
    pokemon
    0.46
    onge
    0.44
    chemy
    0.43
    ans
    0.43
    Pokémon
    0.42
    lt
    0.42
    0.42
    Act Density 0.011%

    No Known Activations