INDEX
Explanations
names and titles of individuals or entities
proper nouns or names
Negative Logits
eleph      -0.76
BALL       -0.72
å¤         -0.67
VEN        -0.65
ĵĺ         -0.65
RED        -0.65
Vert       -0.65
Percy      -0.64
2020       -0.63
©¶æ        -0.63
Positive Logits
i          1.94
iologist   1.23
ei         1.18
iak        1.16
ih         1.10
iology     1.09
ii         1.08
iator      1.05
ie         0.99
i          0.99
Activations Density 0.100%
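
To make the density figure concrete, here is a minimal sketch. It assumes density means the fraction of token positions where the feature's activation is above zero; that definition, the function name `activation_density`, and the sample data are illustrative assumptions, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" counts a position as active when its
    activation is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.100% corresponds to the feature firing on roughly one token in a thousand.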