INDEX
Explanations
references to trending topics and significant events
Head Attr Weights
0:0.06
1:0.08
2:0.09
3:0.05
4:0.04
5:0.05
6:0.22
7:0.03
8:0.04
9:0.23
10:0.04
11:0.02
Negative Logits
Mir       -4.43
Barnett   -3.92
Mir       -3.91
Miranda   -3.62
enclave   -3.42
Guan      -3.37
Gilbert   -3.28
untu      -3.19
Kira      -3.17
Gil       -3.12
Positive Logits
Trend     7.42
Trend     6.94
trend     5.97
trends    5.92
trending  5.47
Trends    5.18
tra       3.69
Bold      3.67
Tact      3.35
Legend    3.34
Activations Density: 0.001%