INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.07
2:0.08
3:0.07
4:0.08
5:0.08
6:0.08
7:0.08
8:0.08
9:0.08
10:0.09
11:0.08
Negative Logits
Sailor: -2.68
Reilly: -2.65
cancelled: -2.62
showc: -2.60
delinquent: -2.49
stranded: -2.49
Tolkien: -2.45
Stevenson: -2.44
fixture: -2.42
haunt: -2.38
Positive Logits
architectures: 2.83
�: 2.82
�: 2.79
�: 2.73
praising: 2.72
��: 2.70
�: 2.69
�: 2.67
Deploy: 2.62
�: 2.59
Activations Density 0.000%
No Known Activations
This feature has no known activations.