INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.08
2:0.09
3:0.07
4:0.08
5:0.08
6:0.08
7:0.08
8:0.08
9:0.06
10:0.08
11:0.09
Negative Logits
ither       -1.82
ourning     -1.74
entin       -1.74
izons       -1.66
oba         -1.60
xes         -1.52
olphins     -1.51
Nano        -1.50
ichever     -1.49
Fiscal      -1.49
Positive Logits
��          1.94
rul         1.80
confir      1.69
bug         1.65
disapp      1.65
advoc       1.65
agre        1.63
"""         1.63
implication 1.63
hoax        1.62
Activations Density 0.000%
No Known Activations
This feature has no known activations.