INDEX
Explanations
references to Valentine's Day
Negative Logits
inoa      -0.16
ahren     -0.15
imbus     -0.15
liver     -0.14
Liver     -0.14
inator    -0.14
Pessoa    -0.14
ildren    -0.14
Lon       -0.14
shit      -0.14
Positive Logits
couples     0.31
Couples     0.31
romantic    0.27
Couple      0.27
romance     0.25
Romantic    0.24
couple      0.23
Romance     0.22
Rom         0.21
Valentine   0.21
Activations Density: 0.130%