INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.08
3:0.09
4:0.08
5:0.08
6:0.07
7:0.08
8:0.08
9:0.09
10:0.08
11:0.09
Negative Logits
pora      -1.60
olars     -1.56
unda      -1.55
ividual   -1.54
azeera    -1.53
aimon     -1.52
eele      -1.52
dstg      -1.50
Riy       -1.50
clair     -1.48
Positive Logits
Bans      1.68
nodd      1.56
CVE       1.56
Termin    1.56
Maver     1.52
mant      1.49
terness   1.45
unker     1.42
tower     1.42
myster    1.41
Activations Density 0.000%
No Known Activations
This feature has no known activations.