INDEX
Explanations
punctuations and quotations used in the text
New Auto-Interp
Negative Logits
'));
-1.12
)";
-1.00
]');
-0.97
`,
-0.96
}');
-0.94
)");
-0.94
.";
-0.93
'):
-0.93
")));
-0.91
%");
-0.91
POSITIVE LOGITS
("1.56
]=="
1.44
"
1.40
="
1.36
:@"
1.34
!("1.31
“
1.25
('1.22
_("1.19
["
1.18
Activations Density 0.148%