INDEX
Explanations
instances of participation or involvement in activities
Negative Logits
Token        Logit
yling        -0.17
eking        -0.16
eve          -0.15
egov         -0.14
ợ            -0.14
reasonable   -0.14
дот          -0.14
bil          -0.14
endency      -0.14
ertino       -0.14
Positive Logits
Token        Logit
ail          0.17
Tape         0.16
ischer       0.15
tape         0.15
amente       0.14
uncomment    0.14
Dana         0.14
widow        0.14
activities   0.14
such         0.14
Activations Density: 0.047%