Explanations
proper nouns
words related to individual identities or names
Negative Logits
spring     -0.84
matic      -0.79
maiden     -0.72
sets       -0.67
rises      -0.66
animous    -0.65
graph      -0.64
lime       -0.62
mable      -0.61
north      -0.60
Positive Logits
chnology   1.29
ete        1.25
eme        1.01
lements    0.95
uve        0.91
opol       0.88
zos        0.85
elist      0.85
anu        0.83
lde        0.81
Activations Density: 0.014%