INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.08
4:0.09
5:0.07
6:0.09
7:0.08
8:0.06
9:0.09
10:0.08
11:0.08
Negative Logits
ividual             -1.79
levant              -1.71
[no visible token]  -1.71
Marginal            -1.67
etc                 -1.59
Posted              -1.56
inki                -1.52
abiding             -1.52
aneers              -1.50
Flavoring           -1.50
Positive Logits
FU                  1.87
666                 1.65
Gh                  1.63
ilk                 1.57
oos                 1.45
mk                  1.44
GR                  1.43
Braz                1.41
Gan                 1.41
congratulations     1.41
Activations Density 0.000%
No Known Activations
This feature has no known activations.