INDEX
Explanations
mentions of specific locations, with a focus on New York City
occurrences of the preposition "in."
Negative Logits
convol     -0.75
ername     -0.69
FTWARE     -0.62
needles    -0.60
racket     -0.59
rious      -0.58
infer      -0.58
weather    -0.58
ingred     -0.58
spoil      -0.58
Positive Logits
particular  1.43
nutshell    1.13
verts       1.10
ked         1.06
vert        0.98
version     0.95
fact        0.92
clusions    0.89
bound       0.89
pires       0.87
Activations Density: 0.265%