INDEX
    NEGATIVE LOGITS
    "lessness"         -0.07
    "Insert"           -0.07
    " کنترل"           -0.06
    "<Role"            -0.06
    "('+"              -0.06
    "ством"            -0.06
    "Errors"           -0.06
    ":</"              -0.06
    "ITLE"             -0.06
    "Castle"           -0.06

    POSITIVE LOGITS
    " buzz"            0.08
    " Buzz"            0.08
    "awk"              0.07
    " Bernie"          0.07
    "Buzz"             0.07
    " Pixar"           0.06
    " controversial"   0.06
    "_CHAT"            0.06
    " privilege"       0.06
    "icester"          0.06

    Activation Density: 0.003%

    No Known Activations