Explanations
concepts related to institutions, particularly those associated with guidance, organization, and structure
Negative Logits
ird        -0.17
na         -0.16
iras       -0.16
arden      -0.15
outs       -0.15
liv        -0.15
ìĦ¼        -0.15
Gu         -0.15
ongan      -0.14
اÙĦÙĦÙĩ     -0.14
Positive Logits
EntryPoint   0.16
zam          0.15
geh          0.14
ÑĤоÑĩ        0.14
tooltip      0.14
.until       0.14
reeze        0.14
umlu         0.13
nightmares   0.13
iem          0.13
Activations Density 0.280%