INDEX
Explanations
proper nouns, particularly names of individuals
Negative Logits
pedia        -0.16
ollo         -0.15
/Gate        -0.14
つぶ         -0.14
Arthur       -0.14
reportedly   -0.14
isque        -0.14
ê¼           -0.14
анка         -0.13
sworth       -0.13
Positive Logits
pers                 0.19
yme                  0.15
847                  0.15
osob                 0.14
urt                  0.14
ustain               0.14
lep                  0.14
завер                0.13
.componentInstance   0.13
ensing               0.13
Activations Density 0.002%