INDEX
    Explanations

    words related to forceful and destructive actions

    words related to violence or physical force

    Negative Logits
    otype             -0.69
    otes              -0.69
     Leilan           -0.67
     Wise             -0.67
    æĥ                -0.67
    redo              -0.66
    DragonMagazine    -0.65
    abet              -0.65
    isans             -0.64
     Serving          -0.64

    Positive Logits
     fists            0.93
     dunk             0.86
    lished            0.84
     pounded          0.82
     against          0.81
     smack            0.79
     brakes           0.78
    onite             0.78
     elbows           0.76
     hitters          0.76
    Act Density 0.103%

    No Known Activations