INDEX
Explanations
occurrences of the letter 'L' in various contexts
Negative Logits
ots        -0.20
aptop      -0.18
ikes       -0.18
ike        -0.18
ABS        -0.17
ance       -0.17
ink        -0.16
abs        -0.16
ift        -0.16
ocking     -0.16
Positive Logits
lund       0.20
usat       0.18
om         0.17
elow       0.17
el         0.17
estring    0.16
ekim       0.16
Shields    0.16
utos       0.15
ill        0.15
Activations Density 0.031%