INDEX
Explanations
negations or conditions indicating the falsity of a statement or situation
Negative Logits
and
-0.57
5
-0.53
3
-0.52
7
-0.49
<eos>
-0.49
↵↵↵↵
-0.49
;
-0.47
’
-0.47
and
-0.46
والع
-0.46
Positive Logits
InjectAttribute
1.02
protoimpl
0.98
enumi
0.92
(!__
0.92
=!
0.90
saraba
0.88
AccessorTable
0.86
GraphicsUnit
0.83
HasBeenSet
0.83
{!
0.81
Activations Density 0.041%