INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.07
3:0.09
4:0.09
5:0.07
6:0.08
7:0.08
8:0.08
9:0.08
10:0.09
11:0.09
Negative Logits
moot        -2.12
Cassidy     -1.80
outweigh    -1.61
Loving      -1.60
Buddhism    -1.60
Hurt        -1.59
Voting      -1.58
Abrams      -1.58
ihilation   -1.54
Renew       -1.51
Positive Logits
bda         2.00
common      1.86
ECD         1.85
stretched   1.84
doi         1.83
udeb        1.78
Python      1.75
typically   1.75
usually     1.75
{"          1.74
Activations Density 0.000%
No Known Activations
This feature has no known activations.