Explanations
references to superhero movies and their production details
Negative Logits
Token     Logit
岸        -0.16
Klopp     -0.15
ski       -0.14
Bun       -0.14
одав      -0.14
hausen    -0.14
pai       -0.14
шиб       -0.14
microbi   -0.13
ghi       -0.13
Positive Logits
Token     Logit
Justice   0.40
DC        0.36
Justice   0.34
Aqu       0.32
DC        0.31
Snyder    0.30
justice   0.30
justice   0.29
Warner    0.28
Aqu       0.28
Activations Density 0.015%