Explanations
phrases emphasizing the presence of nouns and descriptors in various contexts
Negative Logits
/Form    -0.16
άÏģ      -0.15
Slee     -0.15
oad      -0.15
ond      -0.14
گر       -0.14
OND      -0.14
ubber    -0.14
geb      -0.14
endum    -0.14
Positive Logits
linger        0.18
ishi          0.17
agrams        0.16
addresses     0.15
ausal         0.15
okol          0.15
stants        0.14
zos           0.14
uras          0.14
quisitions    0.14
Activations Density: 0.278%