INDEX
Explanations
proper nouns related to people and their backgrounds
Negative Logits
oni              -0.15
Polic            -0.14
wan              -0.14
apart            -0.14
xcf              -0.14
tuz              -0.13
impe             -0.13
(token missing)  -0.13
oram             -0.13
æķ£              -0.13
Positive Logits
ãĥŃãĥ¼           0.17
lal              0.15
åĨĨ              0.15
elps             0.15
.scalablytyped   0.15
Nolan            0.14
mÃłn             0.14
ÑĢана            0.14
.reserve         0.14
elop             0.14
Activations Density 0.038%