Explanations
attends to numerical tokens from other similar numerical tokens listed earlier in the sequence
Head Attribution Weights
Head 0: 0.13
Head 1: 0.14
Head 2: 0.14
Head 3: 0.12
Head 4: 0.12
Head 5: 0.03
Head 6: 0.12
Head 7: 0.17
Negative Logits
Rom          -0.26
CCN          -0.24
Reg          -0.24
слава        -0.23
Martens      -0.23
obraz        -0.23
<h1>         -0.23
ску          -0.23
rég          -0.22
Revolution   -0.21
Positive Logits
ſelf              0.48
متعلقه            0.47
فريبيس            0.46
LookAnd           0.45
resourceCulture   0.45
myſelf            0.43
himſelf           0.43
Искәрмәләр        0.42
betweenstory      0.41
IUrlHelper        0.41
Activations Density: 0.166%