INDEX
Explanations
the word "of" followed by possessive pronouns or determiners
phrases referencing the concept of size or proportion related to various subjects
Negative Logits
)].         -0.76
aire        -0.75
Enlarge     -0.73
KK          -0.70
far         -0.69
umerable    -0.69
claim       -0.69
orie        -0.68
hra         -0.67
hran        -0.65
Positive Logits
veins       0.73
our         0.72
each        0.71
planets     0.71
these       0.70
Nanto       0.69
your        0.69
the         0.69
those       0.67
atoms       0.67
Activations Density: 0.164%