INDEX
Explanations
references to significant political figures and events
Negative Logits
ãģıãĤĮãģŁ    -0.14
Shutdown     -0.13
oppable      -0.13
aln          -0.13
ģ            -0.13
Bris         -0.13
afen         -0.13
_READONLY    -0.13
skirts       -0.13
disappe      -0.13
Positive Logits
inkle        0.20
icking       0.15
ramer        0.14
ASY          0.14
hay          0.14
ida          0.14
.dx          0.14
atur         0.14
yle          0.14
og           0.14
Activations Density: 0.066%