    Explanations

    references to actors and their roles in various contexts

    Negative Logits
    Token         Logit
    "ader"        -0.19
    "erable"      -0.19
    "erator"      -0.18
    "est"         -0.18
    "ned"         -0.17
    "seo"         -0.16
    "394"         -0.16
    "coming"      -0.16
    "eration"     -0.16
    "Acts"        -0.15
    Positive Logits
    Token         Logit
    "uate"         0.20
    "-direct"      0.19
    "roles"        0.19
    " roles"       0.18
    " Roles"       0.18
    "/music"       0.18
    "uated"        0.17
    "/model"       0.17
    "/wait"        0.16
    "uating"       0.16
    Activation Density: 0.013%

    No Known Activations