INDEX
Explanations
common pronouns and determiners
NEGATIVE LOGITS
الحره    -0.87
ViewImports    -0.86
ształ    -0.78
Whigs    -0.73
Cadiz    -0.71
GoogleFonts    -0.71
BibitemShut    -0.69
Betyg    -0.69
Huguen    -0.69
виправивши    -0.69
POSITIVE LOGITS
&___    0.66
del    0.60
D    0.57
min    0.57
C    0.56
min    0.52
Ho    0.52
bas    0.51
U    0.50
E    0.50
Activations Density 0.464%