Explanations
phrases that denote attention and concentration
Negative Logits
Administrativna    -0.60
kasarigan          -0.56
ControllerBase     -0.52
KommentareTeilen   -0.51
EIF                -0.49
CURIAM             -0.49
RTSN               -0.47
виправивши         -0.47
LookAnd            -0.47
PYX                -0.46
Positive Logits
attention     0.82
Attention     0.73
attention     0.66
Attention     0.66
внимания      0.64
efforts       0.62
attentions    0.60
atención      0.59
effort        0.59
ATTENTION     0.56
Activations Density 0.128%
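
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is one way to compute such a number, assuming density means "fraction of positions with activation above zero"; the page itself does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active. This definition is not confirmed by the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 128 of which activate.
sample = [0.0] * 99_872 + [1.5] * 128
density = activation_density(sample)
print(f"{density:.3%}")
```

With this made-up sample the ratio is 128 / 100,000, which formats to the same 0.128% shown on the page. A density this low would mean the feature is sparse: it is silent on the vast majority of tokens.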