INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.09
4:0.08
5:0.08
6:0.06
7:0.09
8:0.10
9:0.08
10:0.09
11:0.07
Negative Logits
inav: -2.13
ovan: -1.85
items: -1.78
Jews: -1.69
oslov: -1.66
atin: -1.66
ingo: -1.66
ukong: -1.65
requ: -1.64
agen: -1.64
Positive Logits
whistlebl: 1.89
dips: 1.79
emouth: 1.77
premie: 1.76
tumble: 1.66
lows: 1.66
airs: 1.65
helm: 1.59
recl: 1.59
examines: 1.58
Activations Density 0.000%
This feature has no known activations.