INDEX
Explanations
mentions of actions or descriptions related to misconduct
instances of the word "foul" in various contexts
Negative Logits
_>            -0.79
edia          -0.78
akeru         -0.75
UNCH          -0.72
aeda          -0.70
hare          -0.69
Downloadha    -0.69
igslist       -0.67
ppelin        -0.67
ocobo         -0.67
Positive Logits
cery          0.91
terness       0.83
smelling      0.82
foul          0.82
s             0.77
sie           0.75
nesses        0.73
misc          0.71
weather       0.70
ness          0.67
Activations Density 0.013%