INDEX
Explanations
phrases indicating a specific area of concentration or attention
Negative Logits
Token     Logit
__("      -0.19
aly       -0.17
ish       -0.17
ÙĤد       -0.15
åĵģ       -0.15
uv        -0.15
zim       -0.15
olis      -0.15
adows     -0.14
ìĬ¤íħĮ     -0.14
Positive Logits
Token       Logit
focus       0.16
point       0.16
focus       0.15
-focus      0.15
Tow         0.15
Focus       0.15
yles        0.14
attention   0.14
fix         0.14
revision    0.14
Activations Density 0.051%