Explanations
prepositions and conjunctions
Negative Logits
ſche      -0.77
purpoſe   -0.77
pleaſure  -0.72
Monfieur  -0.71
Perſ      -0.70
raiſ      -0.68
Theſe     -0.68
―――――     -0.67
Majefty   -0.67
ſtate     -0.65
Positive Logits
в      1.43
у      0.92
ў      0.92
В      0.90
В      0.89
في     0.87
Trong  0.85
във    0.85
în     0.83
trong  0.82
Activations Density: 0.028%