INDEX
Explanations
negative assessments or worsening conditions
NEGATIVE LOGITS
Tracce          -0.55
ImageIO         -0.54
Strongly        -0.53
nahilalakip     -0.53
ViewFeatures    -0.52
illet           -0.52
存              -0.52
ukunft          -0.51
avs             -0.51
Autoritní       -0.50
POSITIVE LOGITS
worse           3.27
Worse           2.88
worse           2.84
worst           2.84
Worse           2.67
worst           2.45
Worst           2.37
Worst           2.26
peor            2.17
peores          2.11
Activations Density: 0.075%