Explanations
non-English characters in the text
the character "ľ" in various contexts
Negative Logits
condem       -0.75
raints       -0.74
disadvant    -0.73
hemor        -0.70
Seym         -0.69
Instr        -0.63
womb         -0.62
apes         -0.59
misunder     -0.58
unborn       -0.58
Positive Logits
ï¸ı                         1.20
âĶĢâĶĢ                      1.05
âĸł                         0.86
conom                       0.86
ï¸                          0.82
âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ    0.82
uthor                       0.81
°                           0.80
(token not rendered)        0.79
ł                           0.79
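Several of the positive-logit tokens above look garbled because they appear to be shown in a byte-level BPE display alphabet rather than as decoded text. The sketch below assumes the tokens come from a GPT-2 style byte-level vocabulary, which the dashboard does not state. It rebuilds that byte-to-character table and maps a displayed token back to UTF-8. The names `bytes_to_unicode`, `UNICODE_TO_BYTE`, and `decode_token` are illustrative, not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed vocabulary)."""
    # Printable bytes keep their own code point.
    bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Map a displayed token string back to text; partial UTF-8 becomes U+FFFD."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for shown in ["âĶĢ", "âĸł", "ï¸ı", "conom"]:
        print(repr(shown), "->", repr(decode_token(shown)))
```

Under that assumption, `âĶĢ` is the box-drawing character U+2500, `âĸł` is the black square U+25A0, and `ï¸ı` is the variation selector U+FE0F. Tokens such as `ł` or `ï¸` hold only part of a multi-byte UTF-8 sequence, so they decode to the replacement character on their own.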
Activations Density 0.190%
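As a rough illustration of the density figure, the sketch below assumes that activation density means the share of sampled tokens on which the feature fires (activation above zero). The dashboard does not define the term, so that reading, the function name `activation_density`, and the synthetic sample are all assumptions.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data for illustration only: 19 active tokens out of 10,000.
sample = [0.0] * 9981 + [1.5] * 19
density = activation_density(sample)
print(f"{density:.3%}")
```

On this reading, a density of 0.190% corresponds to the feature firing on about 19 of every 10,000 tokens.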