Explanations
phrases indicating relationships or parts of a whole
Negative Logits
Token            Logit
[…]              -0.51
väli             -0.49
ogóle            -0.49
IBar             -0.47
[...]            -0.46
ACTUALLY         -0.46
kiin             -0.45
AnchorStyles     -0.44
interestingly    -0.44
adays            -0.44
Positive Logits
Token            Logit
her              1.30
his              1.16
your             1.07
my               1.02
our              0.95
him              0.88
henne            0.83
their            0.83
you              0.80
me               0.76
Activations Density 1.815%
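
The page does not say how the density figure is computed. A common reading, assumed here, is that it is the share of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over eight token positions.
sample = [0.0, 0.0, 2.3, 0.0, 0.0, 0.0, 0.7, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 25.000%
```

Under this reading, a density of 1.815% would mean the feature fires on roughly 18 of every 1,000 sampled tokens.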