Explanations
discourse markers and transition phrases in discussions
Negative Logits
ſelf        -1.01
myſelf      -0.97
Efq         -0.95
ſelves      -0.93
felves      -0.90
poffe       -0.88
itſelf      -0.87
Monfieur    -0.87
uſed        -0.85
felf        -0.83
Positive Logits
it          0.88
he          0.66
It          0.59
there       0.55
they        0.55
It          0.50
I           0.50
their       0.49
the         0.49
its         0.49
Activations Density: 0.032%