INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.07
2:0.08
3:0.09
4:0.09
5:0.08
6:0.08
7:0.07
8:0.06
9:0.07
10:0.08
11:0.09
Negative Logits
Reviewer: -1.81
ウス: -1.67
raltar: -1.60
idel: -1.58
itud: -1.44
atu: -1.36
ß: -1.35
COMPLE: -1.34
アル: -1.33
synt: -1.33
Positive Logits
CNBC: 1.48
exaggeration: 1.45
NEWS: 1.37
Whip: 1.34
tweeting: 1.28
skew: 1.28
Deg: 1.27
Journalists: 1.27
Coffin: 1.26
viewership: 1.26
Activations Density 0.000%
No Known Activations
This feature has no known activations.