INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.07
2:0.08
3:0.07
4:0.09
5:0.09
6:0.08
7:0.07
8:0.08
9:0.06
10:0.09
11:0.08
Negative Logits
numbering    -1.84
TextColor    -1.66
numbered     -1.55
iform        -1.54
numer        -1.51
strikingly   -1.49
inhib        -1.48
brightest    -1.47
differed     -1.46
noteworthy   -1.45
Positive Logits
milo       1.79
yourself   1.65
Yourself   1.61
package    1.59
podcast    1.58
Goes       1.57
morrow     1.56
answ       1.52
Explain    1.51
voice      1.51
Activations Density 0.000%
No Known Activations
This feature has no known activations.