INDEX
Explanations
references to specific movies or their notable attributes
Negative Logits
olu      -0.17
izzo     -0.17
erno     -0.15
antino   -0.15
ahren    -0.15
ello     -0.15
oso      -0.15
Jvm      -0.15
jos      -0.15
inus     -0.14
Positive Logits
power        0.17
horizontal   0.17
-power       0.16
Horizontal   0.15
poder        0.15
Power        0.15
POWER        0.15
scaleY       0.15
del          0.15
age          0.15
Activations Density 0.023%