Explanations
references to charitable donations and fundraising activities
Negative Logits
Token    Logit
権       -0.15
權       -0.14
ruk      -0.14
权       -0.13
venta    -0.13
ismet    -0.13
Deng     -0.13
uthor    -0.13
奴       -0.13
uai      -0.13
Positive Logits
Token            Logit
donation         0.64
donations        0.62
contributions    0.60
contribution     0.57
donating         0.51
donors           0.50
Contributions    0.49
contrib          0.49
Contrib          0.49
Donation         0.48
Activations Density 0.275%