INDEX
Explanations
references to films and movies
Negative Logits
Token       Logit
inmun       -0.55
groen       -0.54
tonsoft     -0.54
etragen     -0.51
ävän        -0.51
interesse   -0.49
arché       -0.49
attes       -0.49
нтів        -0.48
petto       -0.47
Positive Logits
Token       Logit
film        2.93
movie       2.79
Film        2.72
film        2.66
films       2.65
movies      2.61
Film        2.60
Movie       2.55
movie       2.51
Movie       2.46
Activations Density: 0.119%
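
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below illustrates that reading only. It assumes density means "fraction of tokens with activation above a threshold"; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature is active. The dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 119 of which activate the feature.
sample = [0.0] * 99_881 + [1.5] * 119
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.119% corresponds to the feature firing on roughly 1 token in 840, which would fit a feature tied to a narrow topic such as films.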