INDEX
    Explanations

    statements of denial or contradiction regarding accusations or plans

    Negative Logits
    "orton"          -0.18
    ".ml"            -0.16
    "ofire"          -0.15
    ".mj"            -0.14
    "¶Į"             -0.14
    "?action"        -0.14
    "agara"          -0.14
    "\tCopyright"    -0.14
    " gut"           -0.13
    "ppo"            -0.13
    Positive Logits
    " anyone"        0.16
    " ANY"           0.16
    "ish"            0.16
    "yc"             0.15
    "äºĭ"            0.15
    "inde"           0.15
    "sort"           0.15
    " any"           0.15
    " Fog"           0.15
    " Boeh"          0.14
    Act Density 0.127%

    No Known Activations