INDEX
Explanations
mentions of social media posts or written communication
references to communication and social media interactions
Negative Logits
Occupations   -0.76
tics          -0.75
Empires       -0.70
sexes         -0.66
Cups          -0.66
Steps         -0.63
Buildings     -0.63
ernels        -0.63
Ones          -0.63
wards         -0.63
Positive Logits
titled        0.94
called        0.90
involving     0.87
resembling    0.84
consisting    0.81
wherein       0.81
aimed         0.76
whereby       0.76
assian        0.76
entitled      0.74
Activations Density: 0.547%