Explanations
mentions of films and cinema-related terms
Negative Logits
film      -0.22
Film      -0.21
films     -0.20
Films     -0.20
Ù         -0.18
filmed    -0.18
filmer    -0.18
filme     -0.18
filmes    -0.17
hips      -0.17
Positive Logits
ic        0.37
noir      0.31
strip     0.29
akers     0.29
ography   0.26
aker      0.26
atic      0.24
stri      0.23
aking     0.22
fare      0.22
Activations Density: 0.046%