INDEX
Explanations
references to specific locations and contexts
Head Attr Weights
0:0.02
1:0.04
2:0.12
3:0.12
4:0.02
5:0.04
6:0.08
7:0.09
8:0.15
9:0.05
10:0.09
11:0.12
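
The head attribution weights above can be ranked to see which heads contribute most to this feature. This is a minimal sketch: the values are copied from the list, while the names `head_weights` and `top_heads` are hypothetical and not part of the dashboard. It assumes each entry reads as head index:weight.

```python
# Head attribution weights copied from the list above (head index -> weight).
# The variable and function names are illustrative assumptions.
head_weights = {
    0: 0.02, 1: 0.04, 2: 0.12, 3: 0.12, 4: 0.02, 5: 0.04,
    6: 0.08, 7: 0.09, 8: 0.15, 9: 0.05, 10: 0.09, 11: 0.12,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights.

    Python's sort is stable, so tied heads keep their original index order.
    """
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Rounded to two decimals to avoid floating-point noise in the sum.
total_weight = round(sum(head_weights.values()), 2)

if __name__ == "__main__":
    for head, weight in top_heads(head_weights):
        print(f"head {head}: {weight:.2f}")
    print(f"total: {total_weight:.2f}")
```

Head 8 has the largest weight (0.15), with heads 2, 3, and 11 tied at 0.12. The listed weights sum to 0.94 rather than 1; whether that reflects rounding or an unlisted component is not stated here.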
Negative Logits
azeera: -1.19
nces: -1.13
redit: -1.09
yss: -1.05
amins: -1.03
verb: -1.00
pointers: -1.00
ormons: -0.99
imil: -0.98
ept: -0.98
Positive Logits
midst: 1.17
vicinity: 1.10
Basin: 0.99
intest: 0.96
nets: 0.96
encl: 0.96
behalf: 0.96
Ku: 0.93
FI: 0.92
Wake: 0.91
Activations Density 0.150%