INDEX
Explanations
No Explanations Found
Head Attr Weights
Head  Weight
0     0.08
1     0.08
2     0.09
3     0.09
4     0.08
5     0.08
6     0.07
7     0.08
8     0.07
9     0.08
10    0.08
11    0.07
Negative Logits
doct   -2.94
scrut  -2.75
pol    -2.66
retri  -2.59
piv    -2.57
bog    -2.51
�      -2.50
Burr   -2.47
tro    -2.46
idav   -2.44
Positive Logits
Glass  2.92
wine   2.86
Eat    2.62
tones  2.57
hots   2.53
Gaza   2.53
ayne   2.48
Food   2.47
Han    2.45
Hong   2.45
Activations Density: 0.000%
No Known Activations
This feature has no known activations.