INDEX
Explanations
instances of the word "or" with high activation values
the word "or" and its usage in various contexts
Negative Logits
Token        Logit
mostly       -0.79
then         -0.65
now          -0.65
tackle       -0.64
probably     -0.59
NOW          -0.58
Lots         -0.58
Probably     -0.57
Almost       -0.56
Here         -0.56
Positive Logits
Token        Logit
anything      1.54
any           1.24
anywhere      1.18
anybody       1.13
slightest     1.12
anyone        1.11
even          1.10
chard         1.10
anymore       1.09
whatsoever    1.09
Activations Density 0.093%
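
For a sense of scale, the density figure can be read as a fraction of tokens. The sketch below assumes that "Activations Density" means the share of sampled tokens on which the feature's activation is above zero; the dashboard does not define the term here, so treat that reading, the function name activation_density, and the sample values as illustrative assumptions rather than the tool's actual computation.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: this mirrors what the dashboard calls "Activations Density".
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 10 tokens, 3 of them active.
sample = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 2.1]
print(f"{activation_density(sample):.1%}")  # prints "30.0%"

# Under the same assumption, the reported 0.093% corresponds to roughly
# one active token per this many tokens:
reported_density = 0.093 / 100
print(round(1 / reported_density))  # prints 1075
```

Read this way, a density of 0.093% would mean the feature fires on roughly one token in a thousand, which is sparse.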