INDEX
    Explanations

    words related to confirming or affirming statements

    affirmative responses to questions or statements

    Negative Logits
    bage                -0.84
    perial              -0.70
    inese               -0.67
    ensis               -0.65
    RAW                 -0.64
     ILCS               -0.63
     externalToEVAOnly  -0.63
    isf                 -0.63
    enegger             -0.60
    leted               -0.59
    Positive Logits
    terday    1.73
     sir      0.81
     Means    0.78
    ZI        0.74
    YES       0.72
    eed       0.70
    asar      0.67
    Mi        0.66
    ñ         0.65
     matter   0.65
    Act Density 0.020%

    No Known Activations