INDEX
Explanations
locations or settings where people interact socially, possibly leading to conflicts
scenarios involving social interactions and conflicts
Negative Logits
elin           -0.83
ilty           -0.74
ector          -0.72
eele           -0.71
ected          -0.65
Pwr            -0.64
Assets         -0.64
atts           -0.64
assets         -0.63
attribute      -0.63
Positive Logits
conversation   1.66
discussion     1.57
discussions    1.52
banter         1.47
conversations  1.45
Conversation   1.42
disagreement   1.38
debate         1.37
Discussion     1.34
dialogue       1.33
Activations Density 0.916%