INDEX
    Explanations

    words related to hostile or aggressive behavior directed at someone

    Negative Logits
    Token           Logit
    " boop"         -0.53
    " shenan"       -0.52
    " becau"        -0.50
    " fucker"       -0.49
    " kaos"         -0.49
    " disagre"      -0.49
    " kani"         -0.48
    " cuck"         -0.48
    " pooh"         -0.47
    " excru"        -0.47
    Positive Logits
    Token           Logit
    " at"           0.59
    " AT"           0.56
    " At"           0.56
    "At"            0.55
    " NKC"          0.53
    " dirait"       0.53
    " väh"          0.53
    "at"            0.50
    "UNICIP"        0.49
    " дописавши"    0.49
    Act Density 0.127%

    No Known Activations