Explanations
phrases related to titles of books, movies, and journals
phrases that reference the structure or content of books, movies, or other media
Negative Logits
rouse       -0.78
sympath     -0.74
advis       -0.74
eater       -0.73
balcon      -0.72
versa       -0.70
offending   -0.70
entimes     -0.70
jaws        -0.69
brake       -0.69
Positive Logits
Names        1.09
Excellence   1.03
Life         1.03
Them         1.03
Everyday     1.02
Yourself     1.02
Lies         1.00
Violence     1.00
Tomorrow     1.00
Madness      0.99
Activations Density: 0.180%