Explanations
quotation marks and direct speech within the text
Negative Logits
Token       Logit
838         -0.15
_PTR        -0.15
Tome        -0.15
exual       -0.15
McKay       -0.14
ÑģÑĤÑĢи    -0.14
ooled       -0.14
Johnston    -0.14
ij          -0.14
aises       -0.14
Positive Logits
Token       Logit
Weiss       0.14
avia        0.14
woff        0.14
amba        0.14
ÙĨاÙĨ      0.14
eland       0.13
Samantha    0.13
oÄŁ         0.13
ccd         0.13
icket       0.13
Activations Density 0.079%