INDEX
Explanations
proper nouns and specific events or titles
Negative Logits
acker       -0.17
setattr     -0.17
Tape        -0.16
rz          -0.16
rodu        -0.15
ascimento   -0.15
ilip        -0.14
vou         -0.14
societies   -0.14
_INTERVAL   -0.14
Positive Logits
(SS    0.32
(SC    0.23
SS     0.23
(S     0.22
(SP    0.19
NSS    0.18
NSS    0.17
SCT    0.17
.SC    0.17
SS     0.16
Activations Density 0.152%