INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.09
3:0.08
4:0.07
5:0.09
6:0.07
7:0.09
8:0.08
9:0.08
10:0.07
11:0.07
Negative Logits
artifacts    -1.86
20439        -1.69
Provided     -1.66
gren         -1.55
qqa          -1.54
Shogun       -1.49
indo         -1.49
amo          -1.47
ruction      -1.45
artisan      -1.45
Positive Logits
DonaldTrump   1.80
GBT           1.67
Carey         1.54
.",           1.53
istar         1.52
atalie        1.51
WATCH         1.50
Osw           1.46
`.            1.45
neurological  1.44
Activations Density 0.000%
This feature has no known activations.