Explanations
strong, colloquial language or profanity
informal and derogatory language expressing frustration or disdain
Negative Logits
otte        -0.73
ello        -0.70
Edmund      -0.66
Windsor     -0.66
Kepler      -0.64
ovo         -0.62
Townsend    -0.60
osen        -0.60
omen        -0.60
Include     -0.60
Positive Logits
shit        3.82
crap        3.27
Shit        2.65
shit        2.56
bullshit    2.21
fuck        2.18
piss        2.01
fucking     2.01
shitty      1.93
stuff       1.84
Activations Density 0.013%