Explanations
words related to physical locations or infrastructure, specifically pedestrian areas
the presence of the substring "estrian" in various contexts
Negative Logits
Token       Logit
impe        -0.59
shock       -0.56
Bain        -0.55
going       -0.54
kefeller    -0.54
Yates       -0.53
faculties   -0.53
clusive     -0.53
plates      -0.53
employed    -0.53
Positive Logits
Token       Logit
rian        1.24
ruct        1.21
reet        1.19
imate       1.17
rians       1.16
oppers      1.15
ream        1.15
hetics      1.12
rations     1.08
imates      1.06
Activations Density: 0.050%
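
To put the density figure in more concrete terms, the short sketch below converts a percentage into an average number of tokens per activation. It assumes that the density is the fraction of sampled tokens on which the feature fires, which the page itself does not state. The function name `tokens_per_activation` is hypothetical and is not part of any dashboard API.

```python
def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens seen per activation.

    Assumes density_percent is the percentage of tokens on which the
    feature is active (an interpretation, not confirmed by the source).
    """
    if density_percent <= 0:
        raise ValueError("density must be positive")
    return 100.0 / density_percent


# Density value reported above.
density_percent = 0.050
print(round(tokens_per_activation(density_percent)))
```

Under that assumption, a density of 0.050% corresponds to the feature firing on roughly one token in 2,000, which fits a feature tied to a narrow pattern such as the "estrian" substring.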