INDEX
Explanations
mentioned contributions or actions made by individuals
verbs indicating contributions or actions taken by individuals
Negative Logits
ocating     -0.75
adding      -0.68
rx          -0.66
PU          -0.66
iners       -0.65
tx          -0.61
arb         -0.60
Phys        -0.60
isol        -0.60
Compared    -0.60
Positive Logits
nesday          0.64
Ĥİ              0.61
himself         0.61
isons           0.61
sublime         0.59
byss            0.59
sage            0.58
Pu              0.58
tuberculosis    0.57
likewise        0.57
Activations Density 0.399%