INDEX
Explanations
words related to restrictions or limitations
Negative Logits
rei      -0.15
keit     -0.14
oli      -0.14
iry      -0.14
estre    -0.13
391      -0.13
od       -0.13
_mgmt    -0.13
fried    -0.13
reau     -0.13
Positive Logits
by            0.69
oleh          0.56
bợi           0.51
توسط          0.47
by            0.46
_by           0.39
بواسطة        0.36
tarafından    0.35
.by           0.32
edBy          0.32
Activations Density 0.187%
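
The dashboard does not say how the density figure is computed. A common reading is the share of token positions on which the feature has a non-zero activation. The sketch below illustrates that reading only; the function name, the threshold, and the sample data are hypothetical and not taken from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires. The dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1,000 token positions, feature active on 2 of them.
sample = [0.0] * 998 + [1.3, 0.4]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.187% would mean the feature fires on roughly 19 of every 10,000 token positions.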