INDEX
    Explanations

    words related to professions and work environments

    phrases related to law and authority figures

    Negative Logits
     lamented      -0.61
     ®             -0.60
     CLS           -0.60
    arnaev         -0.59
     eagerly       -0.59
     mere          -0.58
     famed         -0.58
     Released      -0.58
     "#            -0.57
    surprisingly   -0.57
    Positive Logits
     [             1.26
     ['            1.25
     everybody     1.10
     …"            1.08
     ..."          1.07
    ,"             1.07
     somebody      1.06
    .''            1.06
    ."             1.05
    ,'"            1.05
    Activation Density: 1.337%

    No Known Activations