INDEX
Explanations
nouns related to shops or places where goods are sold
Negative Logits
uality      -0.19
ITU         -0.17
positions   -0.17
indi        -0.17
UAL         -0.17
itation     -0.16
ually       -0.16
ivated      -0.15
ousse       -0.15
ual         -0.15
Positive Logits
ary         0.51
naire       0.41
aries       0.39
nement      0.36
naires      0.35
nel         0.33
ARY         0.31
er          0.29
ery         0.27
äre         0.27
Activations Density: 0.097%