Explanations
the pronoun "it" in various contexts
Negative Logits
Token      Logit
ollo       -0.18
ãĤ¢ãĥ¼     -0.15
deaux      -0.15
ghan       -0.15
aturday    -0.15
usch       -0.15
nock       -0.15
254        -0.14
cken       -0.14
šti        -0.14
Positive Logits
Token      Logit
appears    0.24
appear     0.23
turns      0.23
remains    0.22
trans      0.22
Emer       0.22
emerged    0.21
emerge     0.21
remain     0.20
Emerging   0.19
Activations Density: 0.094%
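
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the sample data below are all hypothetical and chosen only to illustrate that reading.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires; the source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample (not real data): 94 active tokens out of 100,000.
sample = [1.0] * 94 + [0.0] * 99_906
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.094% would mean the feature fires on roughly 1 in every 1,000 sampled tokens.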