Explanations
phrases related to specific movies, TV shows, and actors
references to specific movies and pop culture elements
Negative Logits
submar       -0.78
appointing   -0.77
referen      -0.73
umably       -0.71
ibly         -0.70
defect       -0.70
declining    -0.69
rower        -0.69
heastern     -0.68
sponsoring   -0.67
Positive Logits
Circus       1.00
Nights       0.95
Deadly       0.95
Forever      0.94
Girl         0.93
Madness      0.93
Mama         0.91
Night        0.91
Kids         0.90
Secrets      0.90
Activations Density 0.795%