INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.06
1:0.08
2:0.10
3:0.09
4:0.08
5:0.07
6:0.06
7:0.07
8:0.08
9:0.09
10:0.09
11:0.07
Negative Logits
velength   -2.14
shared     -1.90
liament    -1.77
overfl     -1.66
orge       -1.60
iP         -1.58
iq         -1.57
enos       -1.57
ricanes    -1.55
levision   -1.52
Positive Logits
ゴン        1.87
WARE       1.74
Baal       1.69
whichever  1.68
playbook   1.57
══         1.52
YA         1.51
carnage    1.51
****       1.50
ledger     1.48
Activations Density 0.000%
No Known Activations