INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.07
2:0.09
3:0.08
4:0.09
5:0.08
6:0.09
7:0.08
8:0.08
9:0.07
10:0.09
11:0.07
Negative Logits
gradation     -1.84
usable        -1.69
Reviewer      -1.65
ingu          -1.63
ensibly       -1.59
onomic        -1.55
ongyang       -1.52
formance      -1.52
possession    -1.51
ioned         -1.50
Positive Logits
Nap        1.61
icz        1.56
Mig        1.55
eni        1.52
talking    1.52
SPD        1.50
Manz       1.50
mosp       1.50
laughs     1.50
Humph      1.45
Activations Density 0.000%
This feature has no known activations.