Explanations
phrases that refer to drawing attention or indicating focus
Negative Logits
Token         Logit
TY            -0.17
ookie         -0.17
uckle         -0.15
pios          -0.14
caracter      -0.14
rant          -0.14
Rosenstein    -0.14
iasi          -0.14
shaw          -0.13
etag          -0.13
Positive Logits
Token         Logit
attention     0.70
attention     0.57
Attention     0.56
Attention     0.52
внимание      0.45
_attention    0.41
atención      0.38
注意          0.37
вним          0.36
notice        0.35
Activations Density: 0.065%