    Explanations

    statements expressing a point of view or making a claim

    Negative Logits
     boks       -0.60
     erk        -0.59
     traktor    -0.59
    Rgds        -0.59
     koc        -0.56
     stik       -0.55
     anse       -0.54
     reger      -0.54
     skr        -0.54
     spion      -0.53
    Positive Logits
    .-"         0.64
     shayari    0.61
     🤣🤣       0.60
     😭😭       0.59
     milf       0.57
     CARTOON    0.56
     ciebie     0.56
     soulign    0.55
     faggot     0.55
     theirs     0.55
    Act Density 0.298%

    No Known Activations