INDEX
    Explanations

    expressions of political statements or assertions

    Negative Logits
    reten         -0.15
    getManager    -0.14
    iets          -0.14
    CJK           -0.14
    sembl         -0.14
     NUIT         -0.13
    bservice      -0.13
     mesel        -0.13
    ;break        -0.13
    /cpu          -0.13
    Positive Logits
    ta            0.18
     FD           0.16
    TA            0.15
    DS            0.14
     [            0.14
    Ë             0.13
    [s            0.13
    eref          0.13
    nam           0.13
     gamb         0.13
    Activation Density 0.149%

    No Known Activations