Explanations
the word "full", and variations of the word "change"
full or change
Negative Logits
caufe      -0.97
ſeveral    -0.93
Theſe      -0.92
cauſe      -0.92
Jefus      -0.91
ſhall      -0.89
preſent    -0.89
juſ        -0.88
ſtand      -0.88
itſelf     -0.88
Positive Logits
<bos>      0.64
(blank)    0.63
and        0.61
↵          0.60
IsContent  0.58
,          0.56
↵↵         0.56
hoeddwyd   0.54
I          0.53
for        0.53
Activations Density 0.292%