INDEX
Explanations
the word "of" in a variety of contexts
phrases emphasizing the quantity or prevalence of something
Negative Logits
ples        -0.75
ettel       -0.75
uction      -0.68
chops       -0.61
rematch     -0.58
mut         -0.56
byn         -0.56
gee         -0.56
disposed    -0.56
dinand      -0.55
Positive Logits
us          1.08
them        0.81
humankind   0.79
what        0.77
those       0.77
mankind     0.76
these       0.75
Europe      0.74
course      0.72
the         0.69
Activations Density 0.053%