INDEX
Explanations
phrases indicating probability or likelihood
phrases expressing likelihood or probability
Negative Logits
ciating    -0.74
arantine   -0.73
perty      -0.72
ilan       -0.70
GBT        -0.70
elight     -0.70
nai        -0.69
Ô          -0.68
rency      -0.68
gall       -0.68
Positive Logits
that       1.20
there      0.86
they       0.85
THAT       0.84
that       0.75
whoever    0.74
we         0.72
none       0.70
many       0.68
she        0.68
Activations Density 0.191%