INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.10
2:0.08
3:0.08
4:0.07
5:0.06
6:0.09
7:0.07
8:0.09
9:0.06
10:0.09
11:0.07
Negative Logits
nodd       -1.97
AMI        -1.73
giveaways  -1.67
lett       -1.63
laun       -1.56
quir       -1.55
confir     -1.51
glers      -1.50
rosters    -1.45
pods       -1.44
Positive Logits
Flavoring  2.28
女         2.20
=-         2.05
=/         1.84
divorced   1.83
abeth      1.82
equality   1.72
married    1.66
=          1.66
?",        1.66
Activations Density 0.000%
No Known Activations
This feature has no known activations.