INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.05
2:0.08
3:0.09
4:0.09
5:0.08
6:0.08
7:0.07
8:0.09
9:0.07
10:0.08
11:0.08
Negative Logits
jew      -1.77
Orn      -1.75
ivist    -1.66
oeuv     -1.59
inav     -1.54
cloth    -1.51
Dance    -1.50
utch     -1.49
Writ     -1.49
ovan     -1.46
Positive Logits
tremend      1.78
shitty       1.67
ITS          1.61
enthusi      1.60
internally   1.56
favorable    1.56
stabilized   1.55
coordinates  1.55
favourable   1.54
stability    1.52
Activations Density 0.000%
No Known Activations
This feature has no known activations.