Explanations
references to specific organizations or entities
Head Attr Weights
Head  Weight
0     0.08
1     0.07
2     0.09
3     0.06
4     0.10
5     0.08
6     0.10
7     0.07
8     0.08
9     0.08
10    0.07
11    0.08
Negative Logits
Token    Logit
Sponsor  -3.26
Samar    -3.16
Anthem   -3.04
Vigil    -2.85
Grant    -2.83
Title    -2.82
Grant    -2.77
Miranda  -2.73
Nak      -2.72
Bet      -2.69
Positive Logits
Token        Logit
ngth         3.00
models       2.98
ould         2.92
renheit      2.88
skeletons    2.87
peed         2.77
centimeters  2.68
macros       2.65
achev        2.65
Stats        2.63
Activations Density 0.000%