INDEX
    Explanations

    contradictory statements or negations in discussions

    Negative Logits
    Token           Logit
    "loor"          -0.17
    "Ïĩη"           -0.17
    "ONT"           -0.16
    "ож"            -0.16
    "ards"          -0.16
    "ège"           -0.15
    "æĪ¸"           -0.15
    " floor"        -0.15
    "ardin"         -0.15
    " Nim"          -0.14
    Positive Logits
    Token           Logit
    ".dtd"          0.19
    "ode"           0.17
    "ADX"           0.16
    "leta"          0.15
    "à¥Ģल"          0.15
    "(Scene"        0.15
    "AspectRatio"   0.14
    "MODE"          0.14
    "345"           0.14
    "paralle"       0.14
    Activation Density: 0.005%

    No Known Activations