INDEX
Explanations
contrasts or comparisons between concepts or entities
exceptions or differences
but use, but even, but with, but in, but differs, except
Negative Logits
象       -0.46
asc      -0.43
hak      -0.42
PYX      -0.41
das      -0.41
主       -0.41
ash      -0.41
ře       -0.41
par      -0.40
iman     -0.40
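Positive Logits
只不过 (Chinese: "merely", "it's just that")    1.19
Except                                          0.99
except                                          0.95
Except                                          0.92
difference                                      0.86
不过是 (Chinese: "is nothing but")              0.84
except                                          0.84
diferença (Portuguese: "difference")            0.80
LookAnd                                         0.79
verschil (Dutch: "difference")                  0.79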
Activations Density 0.404%
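
The density figure above can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes density means "percentage of tokens with activation above a threshold"; the function name `activation_density` and the `threshold` parameter are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens on which the feature
    is active, expressed as a percentage. This is an illustrative reading,
    not a confirmed definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy example: 4 active positions out of 1000.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.1]
density = activation_density(sample)
```

Under this reading, a density of 0.404% would mean the feature is active on roughly 4 of every 1000 tokens, which is consistent with a feature that targets specific contrast words such as "but" and "except".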