Explanations
words related to gaining or attracting attention and recognition
Negative Logits
adaki             -0.15
šek               -0.15
SSERT             -0.14
oppable           -0.14
-coordinate       -0.14
HandlerContext    -0.14
astos             -0.14
Richardson        -0.14
uilder            -0.14
iom               -0.14
Positive Logits
attention         0.39
attention         0.32
Attention         0.28
Attention         0.27
headlines         0.26
notice            0.25
interest          0.24
_attention        0.22
atención          0.22
buzz              0.21
Activations Density: 0.091%
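
The density figure can be read as the fraction of sampled token positions on which the feature fires. That reading is an assumption, since the page does not define the metric. A minimal sketch under that assumption follows; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 9 of which are active.
sample = [0.0] * 9991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is sparse: it is silent on almost all tokens and fires only on a small set, which fits a feature whose top positive logits cluster around "attention" and related words.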