INDEX
Explanations
expressions related to paying attention or focusing on details
Negative Logits
ijn         -0.16
abyrin      -0.14
igator      -0.14
ç¡          -0.14
orian       -0.14
ocht        -0.14
DEALINGS    -0.14
anz         -0.14
pot         -0.14
IColor      -0.14
Positive Logits
attention   0.38
homage      0.30
attention   0.28
tribute     0.28
Attention   0.28
Attention   0.26
close       0.25
visit       0.24
visits      0.24
_attention  0.23
Activations Density: 0.014%