Explanations
No Explanations Found
Negative Logits
Token         Logit
mic           -0.60
ber           -0.58
mal           -0.58
proceeding    -0.57
dwelling      -0.55
pol           -0.54
Dest          -0.54
_-            -0.54
im            -0.54
maj           -0.53
Positive Logits
Token         Logit
ļéĨĴ          0.77
TextColor     0.72
xtap          0.66
issan         0.64
eteenth       0.64
perature      0.63
Polo          0.63
udder         0.63
cius          0.62
Romeo         0.61
Activations Density: 0.000%
No Known Activations
This feature has no known activations.