INDEX
Explanations
phrases related to thinking, analyzing, or considering various concepts or ideas
phrases expressing difficulty in recalling or thinking of examples or movements
Negative Logits
yon      -0.77
pione    -0.76
encia    -0.73
yip      -0.72
shift    -0.66
teasp    -0.66
hift     -0.66
ju       -0.65
nels     -0.65
ysc      -0.63
Positive Logits
anyone       1.02
instances    1.01
anywhere     1.01
examples     0.97
anybody      0.95
precedent    0.94
instance     0.94
any          0.91
anything     0.90
comparable   0.90
Activations Density: 0.184%