INDEX
Explanations
first-person pronouns and expressions of personal thoughts or feelings
Negative Logits
Token      Logit
Ñĩи        -0.17
uki        -0.15
umba       -0.14
mong       -0.14
oco        -0.14
oken       -0.14
éĬĢ        -0.13
Hud        -0.13
legates    -0.13
Toni       -0.13
Positive Logits
Token      Logit
/REC       0.16
-clock     0.16
èm         0.15
akat       0.14
à¹ģà¸Ļ     0.14
Wells      0.14
wel        0.14
Cha        0.13
bens       0.13
iez        0.13
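
The two token lists above read as the ten tokens whose output logits this feature pushes down or up the most. The source does not say how they were produced, so the following is only a minimal sketch of the selection step, assuming a hypothetical mapping from token string to logit effect. The function name `top_logit_tokens` and the small sample dictionary are illustrative; the sample values are copied from the lists above.

```python
def top_logit_tokens(logit_effects, k=10):
    """Return (most negative, most positive) token/logit pairs, k of each.

    logit_effects is a hypothetical dict mapping token string -> logit effect.
    """
    items = list(logit_effects.items())
    # Ascending sort puts the most negative effects first.
    negative = sorted(items, key=lambda kv: kv[1])[:k]
    # Descending sort puts the most positive effects first; ties keep input order.
    positive = sorted(items, key=lambda kv: kv[1], reverse=True)[:k]
    return negative, positive


# A few entries taken from the lists above (not the full vocabulary).
sample = {"Ñĩи": -0.17, "uki": -0.15, "/REC": 0.16, "-clock": 0.16, "èm": 0.15}
negative, positive = top_logit_tokens(sample, k=2)
print(negative)
print(positive)
```

In a real pipeline the mapping would cover the whole vocabulary, and only the extremes would be shown.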
Activations Density: 0.187%
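
Activation density is usually read as the share of tokens on which the feature is active at all. The source does not define it, so the sketch below rests on that assumption; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds threshold."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 2 of 1,000 tokens.
sample = [0.0] * 998 + [0.5, 1.0]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.187% would mean the feature fires on roughly 2 tokens in every 1,000.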