INDEX
Explanations
phrases related to exclusivity or singularity
instances of the word "only" in various contexts
Negative Logits
insula          -0.84
arted           -0.64
idon            -0.62
Massive         -0.61
heterogeneity   -0.58
went            -0.58
fragmentation   -0.58
finder          -0.57
Mages           -0.56
--------------------------------------------------------  -0.55
Positive Logits
marginally      0.86
incidentally    0.79
only            0.70
omit            0.65
ļéĨĴ            0.63
barely          0.62
rama            0.61
onso            0.61
ONLY            0.61
sparing         0.60
Activations Density 0.070%
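
As a rough illustration of how a density figure like the one above can be read, the sketch below assumes that "activations density" means the fraction of token positions at which the feature's activation is above zero. That definition, the function name activation_density, and the sample data are assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 7 of which activate the feature.
sample = [0.0] * 9993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.070%
```

Under this reading, a density of 0.070% corresponds to the feature firing on roughly 7 out of every 10,000 tokens.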