INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.09
4:0.07
5:0.08
6:0.08
7:0.08
8:0.08
9:0.08
10:0.07
11:0.08
Negative Logits
enthusi   -2.20
eatures   -2.20
fools     -2.11
spons     -1.90
adolesc   -1.85
reluct    -1.81
murd      -1.77
describ   -1.75
fung      -1.75
nodd      -1.75
Positive Logits
location    2.12
Unknown     1.94
ubi         1.83
Activ       1.80
tained      1.78
�           1.75
��          1.73
Completed   1.72
�           1.70
Change      1.67
Activations Density 0.000%
No Known Activations
This feature has no known activations.