INDEX
Explanations
occurrences of code expressions related to conditional checks
Negative Logits
and             -0.60
2               -0.53
[toxicity=0]    -0.52
5               -0.51
↵↵              -0.50
C               -0.50
and             -0.50
3               -0.50
x               -0.50
values          -0.49
Positive Logits
(!                1.25
AccessorTable     1.20
(!__              1.18
InjectAttribute   1.11
(!                1.07
]--;              1.07
AssemblyCulture   1.02
=!                1.00
CloseOperation    1.00
+#+#              1.00
Activations Density: 0.024%