INDEX
Explanations
phrases or sentences expressing unity or collective belonging
phrases emphasizing inclusivity or collective involvement
Negative Logits
fecture     -0.64
SHIP        -0.63
kson        -0.62
iggurat     -0.62
kamp        -0.62
notation    -0.61
potion      -0.61
culosis     -0.61
ernaut      -0.60
only        -0.59
Positive Logits
stakeholders    1.07
parties         1.03
sorts           1.02
ocating         1.02
involved        1.01
kinds           1.01
facets          0.92
sides           0.91
genders         0.89
igators         0.89
Activations Density 0.081%