INDEX
Explanations
conditional phrases or instances of the word "when."
Negative Logits
Token   Logit
Bab     -0.53
ib      -0.53
uro     -0.52
Bob     -0.51
bab     -0.51
Bag     -0.50
udo     -0.50
Band    -0.49
cy      -0.49
ache    -0.49
Positive Logits
Token     Logit
when      2.34
when      2.16
cuando    2.13
quando    2.08
WHEN      1.85
cuando    1.84
kiedy     1.84
когда     1.83
wanneer   1.80
όταν      1.74
Activations Density 0.309%
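
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample of 100,000 positions are all hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled token positions
    on which the feature is active (activation > threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 309 active positions out of 100,000 tokens.
sample = [1.0] * 309 + [0.0] * (100_000 - 309)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.309%
```

On this reading, a density of 0.309% means the feature is active on roughly 3 of every 1,000 token positions, which fits a feature tied to a single word and its translations.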