Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.07
2:0.08
3:0.08
4:0.07
5:0.08
6:0.09
7:0.07
8:0.09
9:0.08
10:0.07
11:0.07
Negative Logits
Per       -1.66
Gam       -1.56
Singh     -1.55
Singer    -1.51
Theft     -1.51
owns      -1.50
mt        -1.48
Mit       -1.47
Cond      -1.44
Desire    -1.44
Positive Logits
tips         1.99
odox         1.94
sembly       1.91
heast        1.90
apult        1.90
geons        1.89
leys         1.86
vote         1.85
gently       1.80
Flavoring    1.78
Activations Density 0.000%
No Known Activations