INDEX
Explanations
various forms of punctuation and question marks in text
Negative Logits
Token             Logit
RegressionTest    -0.94
                  -0.72
RegistryLite      -0.67
InjectAttribute   -0.61
<bos>             -0.60
...               -0.56
umano             -0.56
personalidade     -0.55
רבה               -0.54
onAttach          -0.54
Positive Logits
Token   Logit
?)      1.34
?]      1.34
?,      1.22
?),     1.20
?";     1.20
?):     1.15
?",     1.07
?).     1.07
?;      1.06
!,      1.06
Activations Density: 0.198%