Explanations
the presence of legal terminology and related concepts
Negative Logits
Token       Logit
później     -0.47
(           -0.45
my          -0.41
smiles      -0.41
a           -0.39
an          -0.38
DOCTYPE     -0.38
'           -0.37
↵           -0.35
м           -0.35
Positive Logits
Token       Logit
tranſ       0.85
Diſ         0.83
ſelf        0.82
pleaſure    0.82
ſen         0.81
purpoſe     0.79
paſſ        0.78
ſeveral     0.76
houſe       0.76
Reſ         0.76
Activations Density: 1.612%