INDEX
Explanations
names of individuals associated with allegations or incidents marked by inappropriate physical contact
names and references related to allegations and accusations of inappropriate behavior
Negative Logits
irtual      -0.74
istically   -0.72
OTS         -0.72
ODUCT       -0.70
Sakuya      -0.69
IDE         -0.68
ATHER       -0.68
worldly     -0.68
ynt         -0.67
ISION       -0.67
Positive Logits
heimer      1.27
Franken     0.98
Ò           0.91
ste         0.85
berger      0.84
bourg       0.83
steen       0.82
fur         0.81
thal        0.81
furt        0.80
Activations Density 0.016%