INDEX
Explanations
occurrences of social and political commentary
Head Attr Weights
0:0.03
1:0.01
2:0.13
3:0.26
4:0.14
5:0.02
6:0.02
7:0.05
8:0.05
9:0.06
10:0.10
11:0.07
Negative Logits
assic        -1.52
accounted    -1.52
"?           -1.43
guiActiveUn  -1.43
outwe        -1.39
"}           -1.32
lance        -1.31
artney       -1.30
…"           -1.30
"},          -1.29
Positive Logits
illustrating  1.30
discusses     1.29
:]            1.28
illustrates   1.28
leans         1.25
dich          1.24
succinct      1.24
explores      1.23
revealing     1.23
reader        1.23
Activations Density 0.612%