Explanations
names of individuals and references to specific people
Negative Logits
また      -0.18
th        -0.18
er        -0.17
jedn      -0.16
あった    -0.16
rosso     -0.15
eric      -0.15
kü        -0.15
俗        -0.15
aug       -0.15
Positive Logits
plorer     0.17
sson       0.17
ilda       0.15
akedirs    0.14
son        0.14
-pane      0.14
Skywalker  0.14
есь        0.13
ernals     0.13
EDIA       0.13
Activations Density 0.808%