INDEX
Explanations
conjunctions and instances of the word "and" in various contexts
Negative Logits
xbc       -0.15
arkan     -0.14
vik       -0.14
anie      -0.13
alike     -0.13
ilk       -0.13
olec      -0.13
aquarium  -0.13
_NOP      -0.13
olt       -0.13
Positive Logits
there     0.28
there     0.21
we        0.17
THERE     0.17
人们      0.16
everyone  0.15
717       0.15
tere      0.15
atar      0.14
we        0.14
Activations Density: 0.242%