INDEX
Explanations
expressions of personal beliefs about marriage and relationships
Negative Logits
ought      -0.17
fol        -0.15
folks      -0.15
perhaps    -0.14
darn       -0.14
aalborg    -0.14
jte        -0.14
rather     -0.13
LOB        -0.13
acht       -0.13
Positive Logits
(not shown)  0.21
['           0.19
fucking      0.19
TMZ          0.19
(not shown)  0.18
IMDb         0.17
fuck         0.17
[            0.17
girl         0.16
BTS          0.16
Activations Density 0.124%