INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.10
3:0.07
4:0.07
5:0.08
6:0.09
7:0.08
8:0.09
9:0.08
10:0.07
11:0.08
Negative Logits
constitu      -2.04
instituted    -1.73
ratified      -1.71
implemented   -1.71
reinstated    -1.64
promul        -1.63
milo          -1.62
fert          -1.61
mathemat      -1.59
instr         -1.59
Positive Logits
skip          2.09
herry         1.70
Column        1.70
column        1.67
mx            1.63
gallery       1.63
ogle          1.55
adata         1.54
ews           1.53
previews      1.50
Activations Density 0.000%
No Known Activations
This feature has no known activations.