INDEX
Explanations
references to sources or origins in the text
Negative Logits
ſta           -0.60
RTGC          -0.60
Efq           -0.56
ſever         -0.54
quæ           -0.54
sidemargin    -0.53
Hauptartikel  -0.53
ſtre          -0.52
ſol           -0.51
otomatig      -0.51
Positive Logits
from   0.66
from   0.53
FROM   0.47
FROM   0.46
från   0.45
From   0.44
From   0.43
来自   0.42
dari   0.41
来自   0.41
Activations Density 0.086%