Explanations
Words that signal a sense of realization or emphasis in a narrative.
Negative Logits
hatta       -0.18
eux         -0.15
Ø·ÙĨ        -0.14
whom        -0.13
herself     -0.13
ÑĢеÑī       -0.13
intColor    -0.13
him         -0.13
[â̦         -0.12
ovah        -0.12
Positive Logits
there       0.28
it          0.24
there       0.23
they        0.22
,           0.19
forth       0.19
,it         0.18
we          0.18
they        0.17
Ù쨥ÙĨ       0.17
Activations Density 0.310%