INDEX
Explanations
proper nouns, particularly names of people or places, combined with some additional text
occurrences of the word "lost" and related lexical forms
Negative Logits
icester   -0.75
asuring   -0.72
hm        -0.72
lahoma    -0.69
dem       -0.66
gers      -0.64
auga      -0.64
Dug       -0.64
acity     -0.62
Schne     -0.62
Positive Logits
LY        1.26
STON      1.19
OST       1.16
ULAR      1.15
IAL       1.13
URN       1.13
NESS      1.12
UR        1.11
ESH       1.10
AGE       1.10
Activations Density: 0.022%