INDEX
Explanations
specific nouns and phrases related to governance and social institutions
Negative Logits
instead    -0.19
instead    -0.19
вмест      -0.16
irates     -0.16
irá        -0.16
iri        -0.15
fid        -0.15
兼         -0.14
eral       -0.14
oreach     -0.14
Positive Logits
مباش        0.19
/non        0.16
anymore     0.15
upo         0.15
届          0.15
但          0.14
nero        0.14
alone       0.14
altogether  0.14
arella      0.14
Activations Density 0.153%