Explanations
multiple occurrences of punctuation marks
Negative Logits
2         -0.70
1         -0.61
3         -0.61
5         -0.53
4         -0.52
9         -0.52
7         -0.51
𝙫         -0.51
8         -0.50
-         -0.48
Positive Logits
.$,       1.51
,:),      1.25
,-,       1.23
,",       1.23
,         1.20
!("{}",   1.20
,<        1.17
°,        1.16
€,        1.16
\%,       1.15
Activations Density: 0.632%