INDEX
Explanations
mentioned age-related details or references to years
Head Attr Weights
0:0.22
1:0.10
2:0.05
3:0.04
4:0.03
5:0.09
6:0.10
7:0.03
8:0.05
9:0.06
10:0.09
11:0.11
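
The head attribution weights can be ranked to see which heads contribute most. The snippet below is a minimal sketch, assuming each entry is a `head:weight` pair for heads 0 to 11. The names `head_weights` and `rank_heads` are illustrative and not part of any dashboard API. The dashboard does not say whether the weights are normalised, so the sum is only reported and nothing is assumed about it.

```python
# Head attribution weights copied from the listing (head index -> weight).
head_weights = {
    0: 0.22, 1: 0.10, 2: 0.05, 3: 0.04, 4: 0.03, 5: 0.09,
    6: 0.10, 7: 0.03, 8: 0.05, 9: 0.06, 10: 0.09, 11: 0.11,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted by weight, largest first.

    Ties keep their original head order because Python's sort is stable.
    """
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]
total = sum(head_weights.values())

print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of displayed weights: {total:.2f}")
```

Head 0 carries the largest displayed weight. The values are rounded to two decimals, so their sum need not be exactly 1 even if the underlying weights are normalised.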
Negative Logits
netflix               -1.83
soDeliveryDate        -1.58
shall                 -1.48
emphasizing           -1.39
quickShipAvailable    -1.37
ashtra                -1.36
uden                  -1.35
emphas                -1.35
urers                 -1.34
conserve              -1.34
Positive Logits
jamin                 1.80
-                     1.70
-$                    1.60
‐                     1.60
َ                     1.59
.-                    1.55
Md                    1.55
\.                    1.54
‑                     1.53
-'                    1.53
Activations Density 0.002%