INDEX
Explanations
emotional and sentiment-driven language related to personal experiences and community events
Head Attr Weights
0:0.02
1:0.05
2:0.09
3:0.03
4:0.02
5:0.05
6:0.05
7:0.07
8:0.30
9:0.09
10:0.06
11:0.13
Negative Logits
Compat    -1.41
Flo       -1.24
proxies   -1.17
dp        -1.13
Catal     -1.05
Dial      -1.02
FP        -1.01
TPP       -1.00
metry     -1.00
Proxy     -0.99
Positive Logits
esson               1.26
ecause              1.19
ership              1.12
opener              1.08
enburg              1.04
antha               1.04
stint               1.03
inventoryQuantity   1.01
iership             1.01
midterm             1.00
Activations Density 0.019%