Explanations
phrases indicating restrictions or limitations
Negative Logits
neau     -0.18
kar      -0.14
isten    -0.13
prest    -0.13
hl       -0.13
lcm      -0.13
费       -0.13
ona      -0.13
HEET     -0.13
adow     -0.13
Positive Logits
.Restr   0.18
xa       0.17
ities    0.17
사항     0.16
odore    0.16
Isles    0.16
spb      0.15
pollo    0.15
사항     0.15
LIMIT    0.15
Activations Density 0.074%