INDEX
Explanations
references to locations, specifically cities in the UK
Negative Logits
Token          Logit
ego            -0.14
inke           -0.13
agogue         -0.13
.basicConfig   -0.13
atak           -0.13
assis          -0.13
essaging       -0.13
znam           -0.13
informant      -0.13
Dim            -0.13
Positive Logits
Token          Logit
vale           0.17
cly            0.16
erty           0.15
shire          0.15
czy            0.15
erce           0.15
ignon          0.14
enler          0.14
ropy           0.14
ä¿             0.14
Activations Density: 0.033%