INDEX
Explanations
proper nouns and references to specific locations or historical figures
Negative Logits
"):      -1.14
'):      -1.11
")));    -1.05
"){      -1.04
"]);     -1.02
")){     -1.01
$")      -1.01
)");     -1.00
'){      -0.99
[]       -0.98
Positive Logits
,        2.80
(),      1.17
,        1.17
،        1.14
,        1.13
!,       1.04
$,       1.04
,        1.00
.,       1.00
?,       0.99
Activations Density 9.892%