Explanations
phrases indicating positive outcomes or successful situations
Head Attr Weights
0:0.02
1:0.00
2:0.09
3:0.04
4:0.14
5:0.03
6:0.04
7:0.39
8:0.02
9:0.03
10:0.09
11:0.05
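
A minimal sketch of how the head attribution weights above could be ranked to see which attention heads dominate. The values are copied from the listing. The names `head_weights` and `top_heads` are hypothetical helpers, not part of any dashboard API.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.02, 1: 0.00, 2: 0.09, 3: 0.04, 4: 0.14, 5: 0.03,
    6: 0.04, 7: 0.39, 8: 0.02, 9: 0.03, 10: 0.09, 11: 0.05,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights."""
    # sorted() is stable, so tied heads keep their original index order.
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


total = sum(head_weights.values())
ranked = top_heads(head_weights)

for head, weight in ranked:
    # Share of the displayed total, not of an assumed total of 1.
    print(f"head {head}: weight {weight:.2f} ({weight / total:.0%} of displayed total)")
```

Head 7 carries the largest weight (0.39), followed by head 4 (0.14). The displayed weights sum to about 0.94. The source does not say whether the gap from 1 is due to rounding, so the sketch normalizes by the displayed total.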
Negative Logits
umb             -1.60
limitation      -1.55
contracted      -1.54
exclusive       -1.54
scope           -1.49
closure         -1.48
necessity       -1.47
rette           -1.45
specifically    -1.45
terminated      -1.44
Positive Logits
Flavoring       1.87
ratings         1.72
Votes           1.71
Ranking         1.61
Merit           1.60
�               1.56
Ratings         1.56
Medals          1.54
Prediction      1.53
trophies        1.52
Activations Density 0.013%