INDEX
    Explanations

    affirmative and negative responses or indications related to decisions

    Negative Logits
    anth          -0.15
    agt           -0.15
    ssi           -0.15
    empo          -0.14
    gos           -0.14
     ha           -0.14
    ØŃÙĩ          -0.14
    aine          -0.14
    anche         -0.13
    andin         -0.13
    Positive Logits
     stavu        0.15
     Helm         0.14
    olini         0.14
    Ïīμα          0.14
    arem          0.14
    VERTISEMENT   0.14
    iona          0.14
    abella        0.13
    Hip           0.13
     timeval      0.13
    Act Density 0.030%

    No Known Activations