INDEX
Explanations
the word "some" and related phrases indicating generality or ambiguity
Negative Logits
swer      -0.16
inand     -0.16
ataire    -0.15
ateur     -0.15
ullam     -0.14
ecycle    -0.14
uzzi      -0.14
bourg     -0.14
ière      -0.14
abra      -0.14
Positive Logits
ones      0.24
others    0.22
others    0.21
who       0.18
/all      0.17
parts     0.17
acci      0.16
extent    0.15
who       0.15
kinds     0.15
Activations Density 0.086%