INDEX
Explanations
movie-related terms
references to movies or the word "movie."
Negative Logits
ilities         -0.79
raham           -0.71
sembly          -0.69
Haitian         -0.68
urst            -0.68
withstanding    -0.68
otic            -0.68
cffff           -0.68
ords            -0.67
functional      -0.66
Positive Logits
theater         1.08
theaters        1.07
goers           1.06
movies          1.00
theatre         0.98
movie           0.96
theat           0.89
eers            0.89
premie          0.88
Movies          0.88
Activations Density 0.039%