INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.10
1:0.05
2:0.09
3:0.08
4:0.06
5:0.09
6:0.08
7:0.07
8:0.09
9:0.08
10:0.08
11:0.08
Negative Logits
BILITIES    -2.08
htaking     -2.00
Mub         -1.80
ulations    -1.77
cliffe      -1.67
hower       -1.66
opus        -1.62
ulf         -1.60
qi          -1.57
bern        -1.57
Positive Logits
BSD         1.97
Nation      1.81
tarian      1.78
Club        1.78
Party       1.72
Leilan      1.71
GAN         1.70
anarch      1.65
offensive   1.63
DEP         1.62
Activations Density: 0.000%
No Known Activations
This feature has no known activations.