INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.07
2:0.09
3:0.08
4:0.08
5:0.08
6:0.08
7:0.09
8:0.07
9:0.08
10:0.07
11:0.08
Negative Logits
evil: -1.62
eters: -1.61
paralysis: -1.54
ipal: -1.50
Privacy: -1.47
DEV: -1.45
cipled: -1.45
punish: -1.44
animous: -1.44
ocket: -1.44
Positive Logits
Metall: 1.83
Scotch: 1.58
DragonMagazine: 1.57
veter: 1.56
isot: 1.56
PubMed: 1.49
cyl: 1.48
Tud: 1.47
Ori: 1.45
opausal: 1.44
Activations Density 0.000%
No Known Activations
This feature has no known activations.