INDEX
Explanations
punctuation marks and symbols
Negative Logits
'
-0.53
-
-0.52
’
-0.50
ítě
-0.46
ič
-0.45
kuto
-0.45
𝙫
-0.44
Jack
-0.43
の大
-0.43
of
-0.42
Positive Logits
!("{}",
1.20
)",
1.18
.$,
1.18
OGND
1.14
\"",
1.13
__',
1.13
?",
1.13
;",
1.12
:",
1.12
?}",
1.11
Activations Density 0.466%