INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.08
3:0.08
4:0.09
5:0.09
6:0.08
7:0.08
8:0.07
9:0.07
10:0.08
11:0.08
Negative Logits
pload  -1.75
��  -1.73
iencies  -1.71
geries  -1.69
apolis  -1.66
¶  -1.64
ivable  -1.57
Yug  -1.56
Neb  -1.56
ensable  -1.56
Positive Logits
spokesperson  1.71
ouf  1.68
heim  1.67
spokesman  1.64
sam  1.60
mother  1.58
icist  1.56
personality  1.50
heimer  1.50
ALSE  1.50
Activations Density 0.000%
No Known Activations