INDEX
Explanations
references to objects and locations
occurrences of the word "in" within sentences
Negative Logits
Token       Logit
incumb      -0.72
idays       -0.65
llor        -0.61
NOW         -0.59
illian      -0.59
izophren    -0.57
emo         -0.57
embold      -0.57
informed    -0.56
phans       -0.56
Positive Logits
Token         Logit
lieu          1.32
animate       1.22
clusions      1.18
accordance    1.11
ked           1.11
conjunction   1.10
between       1.08
situ          1.08
escap         1.04
front         1.02
Activations Density: 0.288%