Explanations
names of individuals
proper nouns, particularly names and titles
Negative Logits
Token        Logit
──           -0.79
FORMATION    -0.70
mbuds        -0.66
ournal       -0.65
ECO          -0.63
ANGEL        -0.62
conom        -0.61
EXP          -0.61
BOX          -0.61
onies        -0.61
Positive Logits
Token        Logit
baugh        1.14
ansky        1.04
etti         1.01
love         0.99
ley          0.98
enberg       0.96
hart         0.96
opoulos      0.94
quist        0.93
inski        0.91
Activations Density: 0.408%