INDEX
Explanations
questions starting with "Where" and followed by a verb
questions or prompts related to location or direction
Negative Logits
lish              -0.64
âĹı               -0.63
teness            -0.63
cycle             -0.63
\\\\\\\\\\\\\\\\  -0.62
letters           -0.61
HCR               -0.60
bear              -0.60
Fas               -0.58
atform            -0.58
Positive Logits
intersect         0.82
Interstitial      0.74
awaited           0.74
surrounded        0.70
geographically    0.70
invaded           0.69
boarded           0.68
abouts            0.67
REDACTED          0.67
SPONSORED         0.66
Activations Density: 0.305%