INDEX
    Explanations

    phrases indicating correctness or affirmation

    Negative Logits
    Token              Logit
    "richt"            -0.19
    " addCriterion"    -0.19
    "stÃŃ"             -0.17
    "ì²Ļ"              -0.16
    " elsewhere"       -0.16
    "rights"           -0.16
    " Else"            -0.16
    "ycz"              -0.15
    " Rights"          -0.15
    " rights"          -0.14
    Positive Logits
    Token              Logit
    " where"           0.20
    " next"            0.19
    " dab"             0.19
    "e"                0.18
    "-handed"          0.17
    "oyo"              0.17
    " alongside"       0.16
    "aneously"         0.16
    " beside"          0.15
    " neben"           0.15
    Act Density 0.030%

    No Known Activations