INDEX
Explanations
profanities and insults
expressions of strong frustration or anger
Negative Logits
Token          Logit
concess        -0.77
preliminary    -0.75
raq            -0.75
authorization  -0.75
migr           -0.75
correspond     -0.74
enrol          -0.73
ön             -0.73
lique          -0.72
rium           -0.72
Positive Logits
Token          Logit
Seriously      1.37
Fuck           1.26
FUCK           1.22
Seriously      1.20
Especially     1.17
Fuck           1.16
Anyway         1.16
Honestly       1.14
Sorry          1.14
Stupid         1.12
Activations Density 0.698%