INDEX
Explanations
punctuation, particularly commas and apostrophes
Negative Logits
-
-0.66
'
-0.61
’
-0.55
2
-0.53
1
-0.53
Jack
-0.52
Jack
-0.52
of
-0.48
R
-0.47
3
-0.47
Positive Logits
.$,
1.27
!("{}",
1.27
__',
1.25
OGND
1.25
\"",
1.22
)",
1.21
>",
1.20
>=",
1.19
}",
1.18
,",
1.18
Activations Density 0.394%