INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.08
2:0.08
3:0.08
4:0.07
5:0.09
6:0.08
7:0.07
8:0.07
9:0.08
10:0.10
11:0.08
Negative Logits
Breitbart   -1.60
4090        -1.57
Tablet      -1.51
nesday      -1.49
Archive     -1.49
Byte        -1.48
rase        -1.46
itbart      -1.46
Sinclair    -1.46
encer       -1.44
Positive Logits
seiz        1.70
burgh       1.66
Sov         1.63
behav       1.62
veter       1.60
Reviewer    1.60
fortun      1.59
nodd        1.57
hous        1.55
Galile      1.53
Activations Density 0.000%
No Known Activations