Explanations
names of individuals involved in legal cases or incidents
Negative Logits
Token       Logit
ço          -0.15
bern        -0.15
rink        -0.15
ests        -0.15
rouw        -0.15
animated    -0.14
anco        -0.14
Narrated    -0.14
Bath        -0.14
clus        -0.14
Positive Logits
Token       Logit
arius       0.22
quan        0.21
ahn         0.20
onte        0.20
reece       0.19
onn         0.19
zell        0.19
errick      0.19
onna        0.18
airie       0.18
Activations Density: 0.095%