INDEX
Explanations
phrases showing empathy, support, or concern for different groups of people
contexts related to global impact and collective consequences
Negative Logits
etting      -0.82
oaded       -0.73
worn        -0.71
Vers        -0.62
pmwiki      -0.62
illac       -0.61
oult        -0.60
imilar      -0.58
disse       -0.58
utenberg    -0.58
Positive Logits
humankind   1.12
wider       1.10
society     1.04
mankind     1.03
everybody   1.02
anybody     0.97
everyone    0.96
broader     0.95
humanity    0.95
entire      0.92
Activations Density 0.281%