Explanations
names and terms related to specific individuals
proper nouns, specifically names of individuals
Negative Logits
Token         Logit
phabet        -0.74
ACTED         -0.71
Ü             -0.67
spirited      -0.66
suicidal      -0.63
mileage       -0.62
temptation    -0.62
Scots         -0.61
combustion    -0.61
IDES          -0.61
Positive Logits
Token         Logit
cz            1.31
ynski         1.19
ewski         1.06
ansky         0.99
arella        0.97
ota           0.97
oslov         0.96
ombie         0.96
owski         0.94
anski         0.94
Activations Density 0.008%