INDEX
Explanations
the presence of specific formatting or markup elements in the text
Negative Logits
tagHelperRunner    -1.28
SourceChecksum     -1.03
__":               -1.01
__':               -0.99
AddTagHelper       -0.97
nahilalakip        -0.97
Hentet             -0.96
principalTable     -0.96
Roskov             -0.94
تقاوى              -0.93
Positive Logits
↵                      0.79
(token not rendered)   0.69
//                     0.65
'                      0.61
///                    0.58
(token not rendered)   0.58
//                     0.58
*                      0.57
(token not rendered)   0.57
(token not rendered)   0.57
Activations Density 0.369%