INDEX
Explanations
repeated phrases that emphasize inclusivity and universal aspects
Negative Logits
each        -0.68
ciascuno    -0.67
każ         -0.66
すべて      -0.66
quelconque  -0.66
all         -0.66
모두        -0.65
ognuno      -0.64
algunos     -0.63
Each        -0.62
Positive Logits
together  0.87
though    0.84
uding     0.83
kinds     0.80
sorts     0.79
three     0.78
ograft    0.76
rounder   0.75
else      0.75
ergies    0.74
Activations Density 0.223%