Explanations
dialogue and expressions of desire or intention
Negative Logits
're          -0.18
лас          -0.17
’re          -0.16
abeth        -0.15
currently    -0.14
zzo          -0.14
ding         -0.14
nid          -0.14
ambda        -0.14
uckles       -0.14
Positive Logits
shall        0.25
fancy        0.22
shall        0.21
appreh       0.20
ancy         0.19
protest      0.19
sha          0.19
SHALL        0.18
repeat       0.17
venture      0.17
Activations Density: 0.124%