Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.06
2:0.09
3:0.10
4:0.08
5:0.08
6:0.07
7:0.07
8:0.08
9:0.08
10:0.08
11:0.08
Negative Logits
tf            -1.66
ventions      -1.64
views         -1.64
thesis        -1.63
Plans         -1.63
tml           -1.58
disciplinary  -1.58
���           -1.57
thood         -1.56
conserv       -1.55
Positive Logits
oother    1.89
stranger  1.81
MSN       1.66
skip      1.62
fou       1.55
ocobo     1.55
idium     1.52
Fenrir    1.50
Sawyer    1.50
byss      1.49
Activations Density 0.000%
No Known Activations
This feature has no known activations.