Explanations
mathematical expressions and scientific notations
Negative Logits
Minus       -0.79
Minus       -0.75
minus       -0.72
descend     -0.72
decreasing  -0.71
downgrade   -0.70
Decreased   -0.70
decreases   -0.69
decreased   -0.69
decrement   -0.67
Positive Logits
=+                0.92
,+                0.90
positive          0.90
positives         0.85
Positive          0.82
positively        0.80
}_{+              0.80
ThroughAttribute  0.80
Positive          0.80
^{+               0.80
Activations Density 0.706%