INDEX
Explanations
occurrences of commas in the text
Negative Logits
eux       -0.19
THEM      -0.17
them      -0.17
них       -0.16
lui       -0.15
нього     -0.15
ragaz     -0.14
them      -0.14
حالی      -0.14
него      -0.13
Positive Logits
there     0.50
it        0.46
we        0.35
there     0.35
they      0.28
you       0.26
many      0.26
Ù쨥ÙĨ     0.25
,it       0.25
nothing   0.25
Activations Density 0.501%
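
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and used only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    The dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of which fire.
toy = [0.0] * 995 + [0.8, 1.2, 0.3, 2.1, 0.05]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.501% would mean the feature fires on roughly one token position in 200.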