INDEX
Explanations
references to historical timelines and achievements
Head Attr Weights
0:0.09
1:0.06
2:0.03
3:0.15
4:0.12
5:0.09
6:0.12
7:0.04
8:0.14
9:0.07
10:0.01
11:0.03
Negative Logits
Omaha           -1.45
wartime         -1.43
ppa             -1.43
                -1.40
elight          -1.39
nationally      -1.37
victory         -1.37
Thumbnail       -1.35
workplaces      -1.35
overnight       -1.35
Positive Logits
Reviewer        1.97
Downloadha      1.96
evidence        1.77
DragonMagazine  1.73
ibaba           1.70
yrics           1.64
etheless        1.62
avg             1.61
?)              1.60
?),             1.59
Activations Density 0.350%