Explanations
references to individuals with the title "Mr."
Head Attr Weights
0:0.06
1:0.02
2:0.05
3:0.07
4:0.05
5:0.09
6:0.03
7:0.02
8:0.06
9:0.13
10:0.10
11:0.26
Negative Logits
purch       -1.83
counterfe   -1.77
dow         -1.73
merce       -1.71
boycott     -1.70
buyers      -1.70
buyer       -1.70
adul        -1.69
ownership   -1.68
purchase    -1.66
Positive Logits
も          1.69
ocious      1.67
Archdemon   1.56
Duration    1.55
=~=~        1.52
Phys        1.49
Cooldown    1.49
Pavel       1.49
ο           1.48
))))        1.48
Activations Density 0.028%