Explanations
second-person pronouns and phrases that engage the reader directly
Negative Logits
acro    -0.08
åŃĺäºİ    -0.07
ÑĢаÑħ    -0.07
imli    -0.07
ahat    -0.06
pei    -0.06
jac    -0.06
YM    -0.06
jee    -0.06
Jaune    -0.06
Positive Logits
enjoyment    0.07
to    0.07
proto    0.07
éĮ    0.07
Proto    0.06
OLON    0.06
443    0.06
Ñĩенко    0.06
nings    0.06
kie    0.06
Activations Density 0.007%