Explanations
relatively manageable states
Negative Logits
Invalid       0.86
incomplete    0.86
incomplete    0.84
незакон       0.82
Invalid       0.81
Unable        0.80
неправи       0.80
leider        0.78
Incorrect     0.78
illegally     0.77
Positive Logits
manageable    2.16
managable     2.08
relatively    1.86
tolerable     1.81
harmless      1.77
benign        1.68
relatively    1.68
比較的        1.62
Relatively    1.62
milder        1.61
Activations Density: 0.556%