INDEX
    Explanations

    terms related to political authority and ideology

    Negative Logits
     unwanted     -0.54
    rowsiness     -0.54
     unlikely     -0.53
     laude        -0.53
    curios        -0.53
     almost       -0.52
    vacy          -0.52
    Hozzáférés    -0.52
    curious       -0.51
    ricated       -0.50
    Positive Logits
     nahilalakip      0.61
     Jej              0.53
     hasattr          0.52
     stället          0.51
     bandoulière      0.51
    ($__              0.51
    makeConstraints   0.51
     morire           0.50
     Italijani        0.50
     surla            0.50
    Activation Density: 0.379%

    No Known Activations