INDEX
Explanations
occurrences of specific keyword phrases in a foreign language, predominantly related to identity or existential states
Negative Logits
्यवस    -0.15
bee     -0.14
ode     -0.14
/ts     -0.14
áže     -0.14
κον     -0.14
Ebony   -0.13
urr     -0.13
seedu   -0.13
obo     -0.13
Positive Logits
사항      0.19
ательно   0.18
eting     0.17
사항      0.17
remark    0.15
ья        0.15
rames     0.15
amo       0.15
atrix     0.15
GINE      0.14
Activations Density 0.005%