INDEX
Explanations
references to popular science fiction franchises and their authors
Negative Logits
Token     Logit
Trilogy   -0.14
anst      -0.14
leh       -0.14
668       -0.14
eo        -0.14
165       -0.14
259       -0.14
Trom      -0.13
amedi     -0.13
799       -0.13
Positive Logits
Token       Logit
tie         0.34
Tie         0.29
tie         0.27
spin        0.25
oficial     0.24
Official    0.23
official    0.23
authorized  0.22
Official    0.22
spin        0.21
Activations Density: 0.147%