Explanations
the word "shy."
instances of the word "shy."
Negative Logits
女	-0.93
è¦ļéĨĴ	-0.84
ACTED	-0.78
urgy	-0.73
代	-0.71
RAW	-0.71
akeru	-0.69
ghazi	-0.67
quality	-0.67
̶	-0.67
POSITIVE LOGITS
ness	1.06
ety	1.03
shy	0.90
vana	0.89
uously	0.82
lish	0.80
eteenth	0.76
sters	0.75
nesses	0.74
rets	0.74
Activations Density 0.009%