INDEX
    Explanations

    Disagreement/Uncertainty

    Negative Logits
    Token         Logit
    "nais"        -0.09
    "esm"         -0.08
    " verdie"     -0.08
    " Performs"   -0.08
    " മേ"         -0.08
    "relse"       -0.07
    " oziroma"    -0.07
    "firma"       -0.07
    " prakt"      -0.07
    "gegevens"    -0.07
    Positive Logits
    Token          Logit
    " guessed"     0.11
    " guessing"    0.10
    " guess"       0.10
    " به"          0.09
                   0.09
    " guesses"     0.08
    "Guess"        0.08
    " plausible"   0.08
    "_guess"       0.08
    " уга"         0.08
    Activation Density: 0.046%

    No Known Activations