INDEX
Explanations
instances of the word "left" in various contexts
Negative Logits
åύ        -0.16
hit       -0.15
rna       -0.15
ars       -0.15
UZ        -0.14
lis       -0.14
inz       -0.14
habit     -0.14
ewire     -0.14
startup   -0.14
Positive Logits
-handed   0.27
wing      0.27
-wing     0.25
-hand     0.23
/right    0.22
enant     0.22
ists      0.21
overs     0.20
ward      0.20
wards     0.20
Activations Density 0.034%