Explanations: quotations and dialogue
Negative Logits
Token     Logit
"         -0.78
</h3>     -0.77
her       -0.68
da        -0.68
van       -0.67
</h6>     -0.67
t         -0.67
«         -0.67
et        -0.66
im        -0.64
Positive Logits
Token     Logit
quæ       1.17
Reſ       1.13
ſever     1.13
Perſ      1.12
ſtate     1.12
Anſ       1.10
reaſon    1.10
itſelf    1.10
myſelf    1.09
Diſ       1.09
Activations Density 0.137%