INDEX
Explanations
phrases related to financial transactions or donations
Negative Logits
936       -0.18
IFO       -0.16
emek      -0.15
jee       -0.15
wu        -0.15
eless     -0.14
_gid      -0.14
lias      -0.14
wat       -0.14
idot      -0.14
Positive Logits
perator         0.17
asl             0.15
utherland       0.15
ActivityResult  0.14
HONE            0.14
uario           0.14
алом            0.13
igne            0.13
Ñĥка            0.13
adge            0.13
Activations Density 0.240%