INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.09
2:0.08
3:0.08
4:0.08
5:0.08
6:0.09
7:0.08
8:0.07
9:0.07
10:0.07
11:0.08
Negative Logits
museums:-2.99
nonex:-2.85
mosquito:-2.77
mayors:-2.75
planes:-2.71
(no token text shown):-2.69
architect:-2.67
bount:-2.67
livest:-2.62
conven:-2.61
Positive Logits
JB:3.22
Lesbian:2.98
Barker:2.90
ORY:2.86
Bell:2.85
Punk:2.83
Hayes:2.80
Sisters:2.76
Shaman:2.75
HW:2.72
Activations Density: 0.000%
This feature has no known activations.