INDEX
Explanations
words and phrases indicating actions, conditions, or relationships in narrative contexts
Negative Logits
ernes       -0.15
adius       -0.15
missible    -0.14
composition -0.14
associ      -0.14
ideshow     -0.13
ickerView   -0.13
ẽ           -0.13
endoza      -0.13
agra        -0.13
Positive Logits
daf         0.17
ris         0.15
èī          0.15
ulis        0.15
sb          0.14
enis        0.14
ึà¸ĩ        0.14
acher       0.14
ãĥ¬ãĥ³      0.14
zan         0.13
Activations Density 0.006%