Explanations
references to current events or updates in news articles
Head Attr Weights
0:0.03
1:0.02
2:0.06
3:0.08
4:0.25
5:0.04
6:0.21
7:0.04
8:0.04
9:0.04
10:0.09
11:0.05
Negative Logits
orem      -1.40
efer      -1.39
upon      -1.39
Malk      -1.36
FTA       -1.32
alky      -1.30
District  -1.29
Erie      -1.28
osta      -1.27
Ag        -1.24
Positive Logits
ONSORED      1.68
Topic        1.51
Compatibility  1.46
VERTISEMENT  1.46
catentry     1.45
iHUD         1.41
��           1.41
=-=-=-=-     1.39
ims          1.38
PDATE        1.37
Activations Density 0.000%