INDEX
Explanations
profanity and oaths
instances of swearing or cursing
Negative Logits
Northwestern    -0.67
Pioneer         -0.67
Fab             -0.67
Seed            -0.66
seed            -0.64
prototype       -0.64
Cater           -0.63
Safari          -0.63
erella          -0.63
Marin           -0.61
Positive Logits
swearing        4.00
cursing         2.35
yelling         1.43
swore           1.36
swear           1.36
shouting        1.27
perjury         1.19
uty             1.16
barking         1.14
spitting        1.14
Activations Density 0.038%