INDEX
    Explanations

    elements related to characters and their actions in films

    Negative Logits
    Token        Logit
    "_mirror"    -0.14
    "rip"        -0.14
    "pora"       -0.14
    "hte"        -0.14
    "presso"     -0.14
    "rips"       -0.13
    "уму"        -0.13
    "ستر"        -0.13
    "Slave"      -0.13
    "clist"      -0.13
    Positive Logits
    Token        Logit
    " handling"  0.15
    "tal"        0.15
    "콜"         0.14
    "алів"       0.14
    " hab"       0.14
    "atoria"     0.14
    "orp"        0.14
    "297"        0.14
    "467"        0.14
    "oni"        0.13
    Activation Density: 0.069%

    No Known Activations