Explanations
the word 'which' in varying contexts
Negative Logits
Token     Logit
Olms      -0.69
ed        -0.67
Juneau    -0.67
ded       -0.66
cy        -0.63
Baton     -0.63
Coss      -0.62
Magee     -0.61
Folsom    -0.60
Hov       -0.60
Positive Logits
Token     Logit
WHICH     1.25
Which     1.24
Which     1.17
which     1.14
which     1.13
wich      1.11
Datuak    1.09
]**       1.05
']))      1.04
hich      0.96
Activations Density 0.153%