INDEX
Explanations
phrases related to social and economic frameworks
Negative Logits
ilton     -0.16
ãİ        -0.15
/her      -0.15
rice      -0.14
Č         -0.14
OKIE      -0.14
ãĥ«ãĥķ    -0.14
eneral    -0.14
imal      -0.13
keit      -0.13
Positive Logits
.me       0.18
ech       0.15
wi        0.15
neph      0.14
aga       0.14
лиÑĨ      0.14
inski     0.14
rotch     0.14
fts       0.14
azon      0.14
Activations Density 0.817%