Explanations
references to authorship and literary analysis
Negative Logits
oku        -0.17
çĶ         -0.15
äºľ        -0.14
fx         -0.14
RTL        -0.14
Speed      -0.14
NAS        -0.13
MAS        -0.13
(blank)    -0.13
Fuck       -0.13
Positive Logits
texts      0.18
texts      0.16
securely   0.16
Margins    0.15
textual    0.15
/text      0.15
Registers  0.15
early      0.15
misog      0.15
dossier    0.14
Activations Density 0.078%