Explanations
technical terms for errors
phrases indicating causation or choice
forms of "that"
Negative Logits
which        -1.54
which        -1.51
Which        -1.44
Which        -1.43
WHICH        -1.31
laquelle     -1.05
quale        -1.00
cui          -0.98
cual         -0.96
lesquelles   -0.93
Positive Logits
that         1.48
that         0.91
bahwa        0.86
rằng         0.86
bahawa       0.71
kwamba       0.70
ότι          0.67
że           0.64
That         0.61
thut         0.59
Activations Density: 4.984%