INDEX
Explanations
segments related to software updates and their functionalities
Negative Logits
<bos>    -0.69
an       -0.67
in       -0.60
no       -0.60
a        -0.57
so       -0.56
e        -0.55
ur       -0.55
too      -0.55
for      -0.55
Positive Logits
purpoſe  1.64
Houſe    1.58
houſe    1.57
ſtate    1.55
Majefty  1.54
ſche     1.50
itſelf   1.49
Anſ      1.48
Reſ      1.47
myſelf   1.46
Activations Density 0.056%