INDEX
    Explanations

    actions involving physical aggression or conflict

    instances of the word "and."

    Negative Logits
    onymous          -0.82
    éŃĶ              -0.78
    :(               -0.76
    :[               -0.74
    Ĥİ               -0.73
    ESE              -0.72
    uthor            -0.70
    Student          -0.70
    BIL              -0.70
    Commission       -0.69
    Positive Logits
     thus             0.92
     consequently     0.92
     therefore        0.88
     assorted         0.86
     hence            0.85
     then             0.83
     possibly         0.81
     flats            0.79
     vice             0.79
    rogens            0.78
    Activation Density: 0.884%

    No Known Activations