INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.06
2:0.07
3:0.09
4:0.08
5:0.09
6:0.07
7:0.08
8:0.08
9:0.08
10:0.09
11:0.08
Negative Logits
Flavoring    -2.18
Minor        -1.93
Major        -1.73
Students     -1.70
apego        -1.70
Fighting     -1.67
Chomsky      -1.63
Moral        -1.63
========     -1.60
Syri         -1.58
Positive Logits
enegger      1.86
hoe          1.82
license      1.80
nan          1.76
Daddy        1.67
din          1.65
wine         1.63
vine         1.59
dam          1.58
web          1.53
Activations Density 0.000%
No Known Activations
This feature has no known activations.