Explanations
instances of film-related information and mentions of specific celebrities
Negative Logits
bsub         -0.17
Blasio       -0.15
ugin         -0.15
ÃŃcio        -0.15
ohon         -0.14
Liberation   -0.14
Pearce       -0.14
ανά          -0.14
olian        -0.14
ector        -0.14
Positive Logits
Werner       0.15
ivy          0.15
armour       0.15
éŀĭ          0.15
elper        0.14
iali         0.14
ạn           0.14
chen         0.14
eft          0.14
vil          0.14
Activations Density 0.111%