Explanations
instances of the verb "put" in various forms
Negative Logits
kv        -0.15
itez      -0.14
rsa       -0.14
/from     -0.14
imeo      -0.14
çĨŁ       -0.14
ials      -0.14
ories     -0.14
sms       -0.14
enders    -0.14
Positive Logits
forth     0.39
aside     0.35
together  0.33
atively   0.31
tering    0.26
ter       0.25
tered     0.25
forth     0.24
forward   0.23
aside     0.23
Activations Density 0.042%