    Explanations

    references to physical violence or abusive behavior

    Negative Logits
    Token        Logit
    " tích"      -0.16
    "abo"        -0.16
    "tuk"        -0.15
    "ira"        -0.15
    " 터"        -0.15
    " tuyển"     -0.14
    "521"        -0.14
    "rijk"       -0.14
    "Stick"      -0.14
    "pak"        -0.13
    Positive Logits
    Token            Logit
    " pinned"        0.18
    " struggling"    0.17
    "aspers"         0.16
    " struggles"     0.16
    " struggle"      0.16
    "submission"     0.16
    " hold"          0.16
    " suff"          0.15
    " Dân"           0.15
    " Hold"          0.15
    Activation Density: 0.058%

    No Known Activations