Explanations
introducing purpose or consequence
Negative Logits
Token       Logit
because     -1.57
and         -1.55
eftersom    -1.43
since       -1.40
that        -1.38
dlatego     -1.27
omdat       -1.23
потому      -1.16
ponieważ    -1.15
protože     -1.13
Positive Logits
Token       Logit
they        2.31
we          1.92
cuando      1.75
ketika      1.73
можно       1.69
lorsque     1.64
can         1.59
nantinya    1.59
later       1.57
można       1.53
Activations Density: 0.029%