INDEX
Explanations
news and commentary text on various topics, such as advertising, politics, social issues, sports, personal reflections, entertainment, and technology
instances of specific social or cultural references
Head Attr Weights
0:0.11
1:0.05
2:0.08
3:0.08
4:0.05
5:0.14
6:0.08
7:0.03
8:0.07
9:0.10
10:0.10
11:0.04
Negative Logits
agre       -2.01
enthusi    -1.94
psychiat   -1.79
exha       -1.78
conclud    -1.68
advoc      -1.68
PDATE      -1.67
thous      -1.64
toget      -1.59
nodd       -1.57
Positive Logits
sky        1.43
Lam        1.35
Zimmer     1.33
Moment     1.33
hus        1.30
Profile    1.26
Spiel      1.23
silver     1.22
\":        1.21
CM         1.19
Activations Density 0.089%