Explanations
conjunctions and linking phrases that indicate a connection or continuation of ideas
Negative Logits
ſelf       -0.71
adecimal   -0.69
msglen     -0.66
ʺ          -0.65
Gruß       -0.64
tems       -0.63
felf       -0.63
neceff     -0.62
(§         -0.61
fubject    -0.61
Positive Logits
we         0.71
I          0.67
something  0.66
you        0.64
everybody  0.64
it         0.64
everyone   0.63
And        0.62
if         0.60
there      0.59
Activations Density 0.400%