INDEX
Explanations
negations and phrases that indicate absence or lack
Negative Logits
aron       -0.17
した       -0.16
しまった   -0.15
になった   -0.15
сделать    -0.15
λω         -0.15
amber      -0.14
іли        -0.14
\↵         -0.14
れた       -0.14
Positive Logits
DBC        0.17
оду        0.16
करत        0.15
558        0.15
azy        0.15
dle        0.14
ulu        0.14
ывать      0.14
LEX        0.14
abant      0.14
Activations Density 0.087%