INDEX
Explanations
phrases indicating choices or alternatives
Negative Logits
iro           -0.17
rong          -0.16
ancellable    -0.15
ONGL          -0.15
リーズ        -0.14
ajar          -0.14
Batt          -0.14
eger          -0.14
ere           -0.14
erro          -0.13
Positive Logits
or            0.26
거나          0.23
，或          0.23
directly      0.22
hoặc          0.22
oder          0.22
或            0.20
либо          0.20
или           0.20
nebo          0.19
Activations Density 0.053%
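
Some tokens in logit lists like the ones above are displayed in a tokenizer's raw byte-level form rather than as readable text (for example `ê±°ëĤĺ` or `æĪĸ`). The following is a minimal sketch of how such strings could be turned back into text, assuming the tokenizer uses the GPT-2-style byte-to-unicode table; that assumption is not stated on this page, and the helper names `bytes_to_unicode` and `decode_token` are illustrative only.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a visible character.

    Assumption: the dashboard's tokenizer uses this byte-level scheme.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    mapping = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in mapping:
            # Non-printable bytes are assigned code points from U+0100 upward.
            mapping[b] = chr(256 + shift)
            shift += 1
    return mapping


UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Decode a byte-level token string to text; leave it unchanged if it
    contains characters outside the byte table (i.e. it is already decoded)."""
    if any(ch not in UNICODE_TO_BYTE for ch in token):
        return token
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ê±°ëĤĺ", "æĪĸ", "oder", "либо"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, the byte-level entries in the positive list decode to words meaning "or" in Korean and Chinese, which fits the explanation given for this feature.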