INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.09
3:0.07
4:0.09
5:0.08
6:0.08
7:0.08
8:0.08
9:0.09
10:0.06
11:0.08
Negative Logits
aunt: -1.83
Intercept: -1.82
Ammunition: -1.71
fame: -1.69
tossed: -1.68
Hur: -1.66
Fil: -1.66
struck: -1.62
silenced: -1.62
tones: -1.62
Positive Logits
actively: 1.99
independent: 1.97
aeda: 1.82
›: 1.81
earch: 1.80
ially: 1.75
rogram: 1.75
ective: 1.74
clock: 1.72
hematically: 1.69
Activations Density: 0.000%
No Known Activations
This feature has no known activations.