INDEX
Explanations
instances where attention is being directed or should be directed towards something
instances of paying attention in various contexts
Negative Logits
venge      -0.76
geries     -0.71
rie        -0.70
byn        -0.68
iHUD       -0.68
joining    -0.65
anus       -0.64
rafted     -0.63
headers    -0.62
cluding    -0.62
Positive Logits
attent     0.86
龍契士     0.84
ibly       0.73
fulness    0.72
geist      0.68
cues       0.67
IBLE       0.65
attentive  0.64
Listen     0.63
quizz      0.63
Activations Density 0.020%