INDEX
Explanations
instances where attention is being focused or drawn to a particular subject
references to attention and its variations
Negative Logits
Token          Logit
Tale           -0.74
Rebell         -0.65
nia            -0.63
Yugoslavia     -0.63
mos            -0.62
tein           -0.61
Dani           -0.61
lins           -0.60
iche           -0.60
Vers           -0.59

Positive Logits
Token          Logit
spans           0.96
span            0.95
attention       0.85
orial           0.84
grab            0.82
largeDownload   0.81
seeker          0.81
estinal         0.80
seekers         0.80
grabbing        0.77
Activations Density 0.053%