INDEX
Explanations
conjunctions and their repeated usage in sentences
Negative Logits
ſelf         -1.00
themſelves   -0.83
himſelf      -0.82
itſelf       -0.77
*/),         -0.76
lenker       -0.75
Demikian     -0.75
}';          -0.74
―――――        -0.74
myſelf       -0.73
Positive Logits
I            0.83
you          0.82
everybody    0.77
they         0.74
we           0.74
2            0.71
it           0.71
everyone     0.67
And          0.65
1            0.64
Activations Density: 0.303%