INDEX
Explanations
personal pronouns and expressions of intention
Negative Logits
Token      Logit
âĸ²        -0.15
rubbish    -0.15
ije        -0.15
lege       -0.14
ouce       -0.14
leans      -0.14
iji        -0.13
ÙĤÙĩ       -0.13
onta       -0.13
bab        -0.13
Positive Logits
Token      Logit
recently   0.21
faced      0.21
facing     0.20
faces      0.18
face       0.18
have       0.18
Recently   0.17
éĿ¢        0.17
faces      0.17
trying     0.17
Activations Density: 0.064%