INDEX
Explanations
conjunctions and phrases indicating connection or association
Negative Logits
ãĤ¦ãĥĪ    -0.15
erç    -0.12
ç¤    -0.12
âijł    -0.12
ï¼ļ"    -0.12
à¹Ħหà¸Ļ    -0.12
både    -0.12
oreach    -0.11
Òij    -0.11
bane    -0.11
Positive Logits
its    0.60
Its    0.50
other    0.46
/or    0.44
Its    0.42
its    0.40
related    0.38
other    0.36
åħ¶    0.32
Other    0.31
Activations Density 0.278%
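
Several tokens in the logit lists above (for example `åħ¶`) look like byte-level vocabulary strings rather than readable text. The sketch below assumes the tokenizer uses the GPT-2 style byte-to-character table, which this page does not confirm; `decode_token` is a hypothetical helper name. Under that assumption it maps each character back to its byte and decodes the result as UTF-8. Tokens holding only part of a multi-byte character decode to a replacement character, and strings containing characters outside the table are returned unchanged.

```python
def bytes_to_unicode():
    """Byte -> printable character table (GPT-2 style byte-level BPE; assumed here)."""
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points 256 and above.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level token string back into text."""
    if any(ch not in UNICODE_TO_BYTE for ch in token):
        # Not a pure byte-level string (e.g. already partly decoded); leave as-is.
        return token
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Partial UTF-8 sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("åħ¶"))  # → 其
```

If the assumption holds, the positive-logit token `åħ¶` is the Chinese character 其 ("its"), which matches the English `its` / `Its` entries in the same list.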