Explanations
temporal references and indications of time-related events
Negative Logits
him          -0.17
him          -0.16
lui          -0.16
ÙĴÙĩ         -0.16
ersh         -0.15
whats        -0.15
icamente     -0.14
/il          -0.14
ysqli        -0.14
THEM         -0.14
Positive Logits
they         0.30
that         0.25
we           0.20
that         0.19
mÃł          0.18
she          0.18
it           0.17
he           0.17
everything   0.17
THEY         0.16
Activations Density 0.063%