INDEX
    Explanations

    instances of the word "disagree."

    Negative Logits
    Token         Logit
    "odial"       -0.14
    "/Private"    -0.14
    "483"         -0.14
    "θμ"          -0.14
    "okit"        -0.13
    " tabIndex"   -0.13
    ".converter"  -0.13
    "dech"        -0.13
    "ató"         -0.13
    "ĴĮ"          -0.13
    Positive Logits
    Token         Logit
    "710"         0.15
    "ota"         0.15
    "orks"        0.14
    "ÃŃda"        0.14
    "pv"          0.14
    "chwitz"      0.14
    "avig"        0.14
    "SSIP"        0.14
    "ida"         0.13
    "_LR"         0.13
    Activation Density: 0.002%

    No Known Activations