INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.08
3:0.07
4:0.09
5:0.07
6:0.09
7:0.08
8:0.08
9:0.08
10:0.07
11:0.08
Negative Logits
pez        -1.91
eger       -1.71
capt       -1.62
found      -1.53
rys        -1.52
clip       -1.51
gotten     -1.50
ppings     -1.48
burning    -1.45
Ort        -1.44
Positive Logits
Dialogue      1.81
)=(           1.80
zona          1.69
akura         1.68
conformity    1.68
Decision      1.66
Relations     1.62
Flavoring     1.60
behavi        1.59
Commentary    1.59
Activations Density: 0.000%
No Known Activations
This feature has no known activations.