INDEX
Explanations
terms related to donation, charity, and humanitarian work
terms related to roles and organizational structures
Negative Logits
Token       Logit
fman        -0.63
arnaev      -0.55
ppa         -0.54
igl         -0.53
xon         -0.52
jah         -0.51
ospital     -0.51
Spoon       -0.50
kef         -0.50
ora         -0.50
Positive Logits
Token       Logit
itself      0.72
consists    0.57
himself     0.57
herself     0.56
yourself    0.56
behaves     0.53
pedia       0.52
themselves  0.51
ourselves   0.51
Himself     0.51
Activations Density: 1.245%
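
The page does not say how the activation density figure is computed. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature has a non-zero activation, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Made-up sample: 1,245 firing tokens out of 100,000 sampled tokens.
sample = [0.0] * 98_755 + [0.5] * 1_245
print(f"{activation_density(sample):.3%}")
```

Under this assumed definition, a density of 1.245% would mean the feature fires on roughly one token in eighty.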