INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.06
2:0.08
3:0.09
4:0.09
5:0.08
6:0.08
7:0.09
8:0.07
9:0.08
10:0.07
11:0.08
Negative Logits
uten        -1.67
TP          -1.64
ы           -1.64
Hath        -1.61
regulars    -1.58
Ages        -1.58
Reeves      -1.55
Assassins   -1.54
ultras      -1.51
Bey         -1.49
Positive Logits
gradient    1.60
pection     1.58
itive       1.57
ommod       1.51
netflix     1.47
acerb       1.47
itarian     1.46
mand        1.46
itivity     1.46
(-          1.45
Activations Density 0.000%
No Known Activations
This feature has no known activations.