Explanations
phrases indicating possession or attribution in various contexts
Negative Logits
Them      -0.17
Them      -0.17
Proud     -0.15
001       -0.14
çĵľ       -0.14
ilan      -0.14
Himself   -0.14
lie       -0.14
YNC       -0.14
fruition  -0.13
Positive Logits
being          0.31
having         0.23
being          0.22
Being          0.20
Being          0.19
how            0.19
among          0.19
contributions  0.18
always         0.18
essere         0.18
Activations Density 0.053%