Explanations
names with specific special characters
instances of a specific character or placeholder in text
Negative Logits
Token       Logit
anwhile     -0.94
lda         -0.71
itably      -0.70
ctors       -0.70
etheless    -0.68
ioned       -0.68
censor      -0.67
Nam         -0.67
Bermuda     -0.64
Libyan      -0.63
Positive Logits
Token       Logit
¹           1.22
º           1.13
į           1.13
£           1.13
ı           1.11
¬           1.11
¡           1.08
Ĭ           1.06
Ĵ           1.04
ħ           1.04
Activations Density 0.208%