Explanations
mentions of the word "local" appearing with high activations
mentions of "local" in various contexts
Negative Logits
xual       -1.16
haar       -0.90
issance    -0.86
hra        -0.83
SHIP       -0.82
uberty     -0.81
ppelin     -0.80
ACTED      -0.78
lihood     -0.77
lda        -0.76
Positive Logits
authorities  0.85
residents    0.82
local        0.81
grocer       0.80
elders       0.79
neighbors    0.78
©¶æ          0.78
opio         0.78
corrid       0.76
neighbours   0.75
Activations Density 0.025%