INDEX
Explanations
phrases that indicate uncertainty or conditionality, particularly in decision-making contexts
Negative Logits
Token    Logit
Geb      -0.17
-        -0.15
orgia    -0.15
-        -0.15
RS       -0.14
jab      -0.14
x        -0.13
ales     -0.13
FT       -0.13
vice     -0.13
Positive Logits
Token    Logit
but      0.22
nhưng    0.17
arella   0.17
اÙĨÙĪ     0.16
깨       0.16
âĹĦ      0.16
aber     0.15
AREST    0.15
но       0.15
but      0.15
Activations Density: 0.214%