INDEX
Explanations
the presence of numbered or listed items within the text
list item markers like a) or 1)
Negative Logits
ckså
-0.59
Мексичка
-0.53
Rhestr
-0.52
zeera
-0.49
skär
-0.48
referrerpolicy
-0.48
שוליים
-0.46
meille
-0.46
ähteet
-0.45
WriteBarrier
-0.44
Positive Logits
((
0.60
(((
0.60
((
0.57
('/:
0.55
)((
0.50
(((
0.50
similiano
0.50
=((
0.47
[(
0.47
!(
0.46
Activations Density 0.049%