INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.08
4:0.08
5:0.08
6:0.07
7:0.08
8:0.07
9:0.10
10:0.08
11:0.07
Negative Logits
tips      -2.10
oufl      -1.99
umbn      -1.95
estyles   -1.93
Inv       -1.85
ulner     -1.85
anmar     -1.83
Pil       -1.82
arij      -1.81
helicop   -1.79
Positive Logits
negro         1.98
VILLE         1.98
RON           1.66
rationality   1.63
correction    1.63
tyranny       1.52
Genie         1.52
injustice     1.49
irrational    1.48
eternal       1.47
Activations Density 0.000%
This feature has no known activations.