INDEX
Explanations
prepositions indicating spatial or temporal relationships
Negative Logits
üz         -0.17
_blob      -0.15
_almost    -0.15
arf        -0.14
uzu        -0.14
onya       -0.14
porto      -0.14
ÏĢιÏĥ       -0.14
.stamp     -0.14
meisjes    -0.14
Positive Logits
odds       0.28
fault      0.24
risk       0.23
advantage  0.22
witter     0.22
ease       0.21
pains      0.21
logger     0.20
peace      0.20
liberty    0.20
Activations Density: 0.042%