Explanations
phrases related to policies and regulations within institutional frameworks
Negative Logits
Token      Logit
benches    -0.15
eming      -0.15
ilty       -0.14
Manson     -0.14
abin       -0.14
æĪIJ        -0.14
bench      -0.14
nevid      -0.14
елÑĸ       -0.14
Mb         -0.14
Positive Logits
Token      Logit
ikut       0.18
ibel       0.17
adoo       0.16
ikh        0.14
vik        0.14
coln       0.14
eyse       0.14
milan      0.14
ikel       0.14
æ¥         0.13
Activations Density: 0.218%