INDEX
Explanations
references to named entities, particularly those that are associated with food or cultural concepts
Negative Logits
Чи      -0.16
ourn    -0.14
사이    -0.13
мм      -0.13
แน      -0.13
xa      -0.13
adlı    -0.13
etik    -0.13
çģ      -0.13
uli     -0.13
Positive Logits
simply        0.42
simplement    0.31
просто        0.29
"             0.27
:             0.24
kıs           0.22
-called       0.21
prostě        0.21
merely        0.21
Simply        0.21
Activations Density 0.278%