INDEX
Explanations
references to movies
mentions of the word "movie" in various contexts
Negative Logits
Token          Logit
ilities        -0.88
ords           -0.74
Haitian        -0.72
odox           -0.70
sembly         -0.68
otics          -0.68
withstanding   -0.68
²¾             -0.67
anting         -0.66
cffff          -0.64
Positive Logits
Token          Logit
theater        1.15
goers          1.11
theaters       1.10
theatre        1.06
theat          0.96
premie         0.93
going          0.91
movies         0.91
eers           0.87
movie          0.87
Activations Density 0.067%
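
The page does not say how the density figure is computed. One common reading, assumed here rather than confirmed by the source, is that it is the fraction of sampled tokens on which the feature has a nonzero activation. A minimal sketch under that assumption follows; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The source page does not define the term.
    """
    if not activations:
        return 0.0
    # Count tokens where the feature is considered active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations; two of the eight tokens fire.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.067% would mean the feature is active on roughly 1 token in every 1,500 sampled.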