Explanations
the word "depending" and its variations, indicating context or conditionality
Negative Logits
Ved     -0.15
opis    -0.15
er      -0.15
oner    -0.14
ghi     -0.14
sko     -0.14
tras    -0.14
pis     -0.13
rese    -0.13
xde     -0.13
Positive Logits
upon                0.29
upon                0.22
Upon                0.21
Upon                0.20
whether             0.20
how                 0.17
æĸ¼                 0.17
<|begin_of_text|>   0.17
ạnh                 0.17
whereabouts         0.16
Activations Density: 0.011%