INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.04
2:0.10
3:0.09
4:0.10
5:0.08
6:0.09
7:0.07
8:0.07
9:0.06
10:0.08
11:0.08
Negative Logits
aples      -2.16
anwhile    -1.94
undown     -1.90
aturday    -1.86
Goods      -1.85
ovember    -1.81
arta       -1.80
aband      -1.77
uria       -1.75
artifacts  -1.69
Positive Logits
persona    1.51
Brit       1.48
opposite   1.46
fart       1.45
faire      1.44
bloom      1.43
ideal      1.43
adapt      1.42
anim       1.38
devoted    1.37
Activations Density 0.000%
No Known Activations
This feature has no known activations.