Explanations
No Explanations Found
Head Attr Weights
0:0.06
1:0.07
2:0.09
3:0.08
4:0.08
5:0.07
6:0.09
7:0.10
8:0.07
9:0.06
10:0.09
11:0.08
Negative Logits
nt        -1.50
hod       -1.41
*:        -1.35
tion      -1.35
learn     -1.34
intel     -1.29
Invalid   -1.28
respons   -1.27
:=        -1.23
uther     -1.22
Positive Logits
enegger   1.56
erey      1.54
Tsu       1.34
Ambro     1.34
士        1.32
mares     1.30
Reilly    1.27
Vaugh     1.26
Canaver   1.23
vre       1.21
Activations Density: 0.000%
This feature has no known activations.