Explanations
instances of the word "it."
Negative Logits
Its       -0.24
Its       -0.24
its       -0.24
оно       -0.19
orf       -0.17
its       -0.17
itself    -0.17
borg      -0.17
nó        -0.17
Ñıке      -0.17
Positive Logits
ty        0.20
chin      0.20
raining   0.18
chy       0.18
cono      0.17
ching     0.16
snow      0.15
alian     0.15
Happ      0.15
rain      0.15
Activations Density 0.210%