    Explanations

    contradictory statements or surprising admissions

    Negative Logits
    ž
    -0.15
    消
    -0.14
    eca
    -0.14
    vailability
    -0.14
    сит
    -0.14
    antz
    -0.13
    बल
    -0.13
    clr
    -0.13
    .setParameter
    -0.13
    plor
    -0.13
    Positive Logits
     but
    0.21
    but
    0.18
     (?
    0.15
    امه
    0.15
    *:
    0.15
     confession
    0.14
    MD
    0.14
     (?)
    0.14
    697
    0.14
    375
    0.14
    Act Density 0.114%

    No Known Activations