INDEX
Explanations
references to people being addressed or mentioned in communications
Head Attr Weights
0:0.04
1:0.03
2:0.06
3:0.10
4:0.12
5:0.04
6:0.03
7:0.17
8:0.11
9:0.05
10:0.10
11:0.09
Negative Logits
��    -1.88
��    -1.68
pse    -1.60
izen    -1.59
���    -1.57
��    -1.55
Ake    -1.45
lingu    -1.42
Tosh    -1.42
imum    -1.41
Positive Logits
vier    1.71
iates    1.45
iors    1.34
receipt    1.32
idon    1.31
irm    1.31
Contributions    1.30
subscriptions    1.29
<<    1.29
stores    1.27
Activations Density 0.001%