INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.08
2:0.07
3:0.09
4:0.09
5:0.08
6:0.10
7:0.09
8:0.07
9:0.07
10:0.07
11:0.08
Negative Logits
う              -2.33
lig             -2.31
ettes           -2.19
pleas           -2.15
unconditional   -2.12
feminine        -2.08
jo              -2.07
padd            -2.06
tro             -2.03
closure         -2.02
Positive Logits
adan      2.72
ultan     2.70
renheit   2.61
alore     2.58
IDA       2.54
kef       2.50
arcity    2.49
ushima    2.48
coni      2.44
paio      2.43
Activations Density: 0.000%
No Known Activations
This feature has no known activations.