INDEX
Explanations
occurrences of the letter "L" in various contexts
Negative Logits
Token    Logit
ane      -0.19
aw       -0.18
ine      -0.18
IB       -0.18
im       -0.17
ib       -0.16
ink      -0.16
aus      -0.16
n        -0.16
orem     -0.15
Positive Logits
Token      Logit
alu        0.21
oris       0.20
iveness    0.20
erne       0.19
ollipop    0.19
ustr       0.18
ateral     0.18
ighth      0.17
iferay     0.17
ombo       0.17
Activations Density: 0.104%