INDEX
Explanations
names of individuals with varying levels of importance
proper nouns, particularly names of people
NEGATIVE LOGITS
Token        Logit
Hispan       -0.68
Slovenia     -0.67
Hornets      -0.67
Clarkson     -0.65
spoiler      -0.65
Brav         -0.63
Catalonia    -0.62
overload     -0.61
narrowing    -0.59
Sloven       -0.59
POSITIVE LOGITS
Token        Logit
Jr           1.12
nr           0.91
uez          0.88
III          0.83
Sr           0.82
inguished    0.81
ensen        0.79
fman         0.79
itars        0.78
grave        0.78
Activations Density 0.234%
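
The density figure above can be read as the share of sampled token positions on which the feature fires. The page does not state how it is computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero
    (here: above-threshold) activation. The dashboard does not confirm this.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 token positions, 2 of which activate the feature.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.234% would correspond to the feature being active on roughly 2 to 3 of every 1,000 sampled tokens.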