INDEX
Explanations
references to monetary values or salaries
Negative Logits
ure        -0.17
rix        -0.17
oly        -0.16
υσ         -0.15
سن         -0.15
파         -0.15
ri         -0.15
autoload   -0.14
ереж       -0.14
rint       -0.14
Positive Logits
anus       0.18
letcher    0.17
cola       0.16
apolis     0.16
erver      0.15
stown      0.15
arges      0.15
idel       0.15
ichten     0.14
ingleton   0.14
Activations Density 0.047%
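
For a sense of scale, 0.047% corresponds to roughly 47 active tokens per 100,000. The sketch below illustrates that arithmetic under an assumed definition: density as the fraction of sampled tokens on which the feature's activation exceeds a threshold. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the dashboard, which may compute density differently.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 47 active tokens out of 100,000 sampled tokens.
toy = [1.0] * 47 + [0.0] * (100_000 - 47)
density = activation_density(toy)
print(f"{density * 100:.3f}%")
```

A feature this sparse fires on very few tokens, so the top-logit lists above are based on a small slice of the sampled text.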