INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.08
4:0.08
5:0.08
6:0.07
7:0.07
8:0.09
9:0.10
10:0.08
11:0.07
Negative Logits
fences: -1.72
shame: -1.71
conventions: -1.69
alike: -1.60
rake: -1.58
hotter: -1.56
nowadays: -1.56
pressures: -1.54
mortals: -1.52
winters: -1.51
Positive Logits
cknow: 2.04
л: 1.92
zsche: 1.89
malink: 1.89
裏�: 1.74
Statement: 1.70
rists: 1.70
ault: 1.69
auga: 1.69
�: 1.67
Activations Density: 0.000%
No Known Activations
This feature has no known activations.