INDEX
Explanations
mentions of illicit romantic relationships
phrases relating to situations characterized by having or experiencing
Negative Logits
arning      -0.75
ensing      -0.71
NESS        -0.70
adj         -0.65
Posts       -0.64
quartered   -0.64
available   -0.64
yond        -0.64
aware       -0.63
UC          -0.63
Positive Logits
meltdown        0.98
chance          0.95
iphany          0.93
intercourse     0.89
conversation    0.88
affair          0.86
conversations   0.85
dealings        0.85
blast           0.84
luck            0.84
Activations Density 0.212%