Explanations
mentions of films and their attributes
Negative Logits
Token      Logit
zw         -0.17
ecurity    -0.16
embre      -0.15
Abed       -0.15
.Toolkit   -0.15
окрем      -0.15
triang     -0.14
áb         -0.14
haul       -0.14
.lesson    -0.14
Positive Logits
Token      Logit
Tel        0.23
hero       0.22
tol        0.22
heroine    0.21
mass       0.20
keer       0.20
Mega       0.20
tel        0.20
interval   0.20
Hero       0.20
Activations Density 0.011%
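
For readers unfamiliar with the density figure, the sketch below shows one common way to compute such a value. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the source does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the
    feature fires at all; the dashboard does not define it.
    """
    activations = list(activations)
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 11 of 100,000 tokens.
sample = [0.0] * 99_989 + [1.5] * 11
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.011%"
```

Under this reading, a density of 0.011% means the feature is active on roughly one token in nine thousand, which fits a narrow feature such as one tied to mentions of films.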