INDEX
Explanations
movie-related terms and references
mentions of movies
Negative Logits
Token     Logit
existent  -0.77
acia      -0.74
cffff     -0.72
adies     -0.71
odic      -0.71
omaly     -0.68
minded    -0.66
avis      -0.65
isol      -0.64
isms      -0.63
Positive Logits
Token     Logit
theaters  1.22
theater   1.17
movie     1.16
movies    1.08
movie     1.01
theatre   1.01
Movie     0.99
goers     0.96
Movie     0.95
theat     0.95
Activations Density: 0.022%