INDEX
Explanations
references to subscription boxes or giveaways
Head Attr Weights
Head  Weight
0     0.05
1     0.03
2     0.05
3     0.06
4     0.06
5     0.03
6     0.08
7     0.24
8     0.05
9     0.05
10    0.11
11    0.15
Negative Logits
Token        Logit
warranties   -1.53
Tribunal     -1.51
ERA          -1.47
unanim       -1.46
licens       -1.44
arbitration  -1.43
Nobel        -1.41
rulings      -1.41
reservation  -1.41
dismissal    -1.41
Positive Logits
Token    Logit
ults     1.49
irlf     1.46
feces    1.42
endif    1.38
imgur    1.36
bp       1.36
Females  1.36
taboola  1.34
github   1.34
anth     1.33
Activations Density 0.000%