Explanations
punctuation and prepositions indicating locations and times in the document
Negative Logits
ipop    -0.17
linger  -0.17
hill    -0.16
struk   -0.15
arker   -0.15
insky   -0.15
kke     -0.14
anas    -0.14
iaux    -0.14
anes    -0.14
Positive Logits
raf           0.16
odd           0.16
££            0.16
mazon         0.16
componentDid  0.15
igkeit        0.14
.uf           0.14
lew           0.14
ãĥªãĤ¢        0.14
uw            0.13
Activations Density: 0.003%