INDEX
Explanations
the pronoun "It" in various contexts
Negative Logits
noc      -0.17
ember    -0.16
Trails   -0.16
trails   -0.15
phia     -0.15
stuff    -0.15
str      -0.15
iaÅĤa    -0.15
lore     -0.15
ergus    -0.14
Positive Logits
ador     0.17
oret     0.15
odore    0.15
odor     0.15
èĥ       0.15
une      0.14
ulus     0.14
Es       0.14
_Lean    0.13
retim    0.13
Activations Density: 0.211%