INDEX
Explanations
documents and articles related to various news, events, and topics
mentions of news articles, reports, and sources related to current events
Head Attr Weights
0:0.07
1:0.03
2:0.08
3:0.06
4:0.03
5:0.05
6:0.08
7:0.02
8:0.36
9:0.08
10:0.06
11:0.02
Negative Logits
peat        -1.62
apologize   -1.55
salute      -1.49
forgotten   -1.47
blooded     -1.47
unnoticed   -1.43
apologise   -1.43
untouched   -1.42
closet      -1.42
ardless     -1.39
Positive Logits
glers        1.82
Vance        1.60
Beir         1.45
Annotations  1.44
miah         1.43
HIP          1.41
ecd          1.36
Caldwell     1.36
Grape        1.35
alion        1.32
Activations Density 0.109%