INDEX
Explanations
phrases indicating changes or developments in a situation
Negative Logits
Token         Logit
Σα            -0.16
okemon        -0.14
.jet          -0.14
кÑĥлÑĮ        -0.14
å²            -0.13
ivol          -0.13
edly          -0.13
/lg           -0.13
اظ            -0.13
ingo          -0.13
Positive Logits
Token         Logit
ļĮ            0.16
gen           0.15
Organisation  0.15
(not shown)   0.13
iliar         0.13
Ĵ             0.13
rows          0.13
usta          0.13
roe           0.13
rome          0.13
Activations Density: 0.068%