Explanations
names of people
references to individuals, particularly names or initials of notable people
Negative Logits
theless      -0.81
minecraft    -0.65
FTA          -0.59
IPS          -0.58
falls        -0.57
itness       -0.57
retake       -0.57
wl           -0.56
heid         -0.55
âĸĪ          -0.55
Positive Logits
Gaal         0.92
pez          0.72
bourg        0.70
vez          0.69
uel          0.67
iao          0.64
Gaul         0.63
amph         0.63
leon         0.61
Benz         0.60
Activations Density 0.108%