INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.07
2:0.08
3:0.07
4:0.07
5:0.09
6:0.09
7:0.08
8:0.07
9:0.09
10:0.08
11:0.08
Negative Logits
Lever        -3.01
��           -2.79
Regulatory   -2.62
reviewer     -2.62
Wonderland   -2.61
llular       -2.57
Lean         -2.57
Regulation   -2.56
Lur          -2.53
Loop         -2.50
Positive Logits
anan         2.88
Semitism     2.77
AIDS         2.76
―            2.74
aina         2.53
ysis         2.51
jen          2.50
HUD          2.50
Gaza         2.50
Sov          2.49
Activations Density 0.000%
No Known Activations
This feature has no known activations.