INDEX
Explanations
quantitative references to monetary values or amounts
Negative Logits
yon      -0.16
óÅĤ      -0.15
rang     -0.15
lamp     -0.15
haps     -0.15
itz      -0.15
098      -0.15
born     -0.14
apers    -0.14
punk     -0.14
Positive Logits
alist    0.17
íģ¼      0.16
ÑĢеÑĪ    0.15
alara    0.15
ahun     0.15
shadow   0.14
_shadow  0.14
许       0.14
imals    0.14
unts     0.14
Activations Density 0.049%