Explanations
phrases that suggest caution or encouragement to take action
Negative Logits
_HT         -0.16
antz        -0.16
ир          -0.14
امي         -0.14
_partial    -0.13
            -0.13
ioni        -0.13
鐘          -0.13
tail        -0.13
uf          -0.13
Positive Logits
hesitate    0.57
hesitation  0.43
hes         0.42
hes         0.36
reluctance  0.28
hesitant    0.25
afraid      0.22
shy         0.22
don         0.18
lico        0.17
Activations Density 0.064%