    Explanations

    words related to specific locations or entities, with "Atelier" and "Hamas" among the top activations

    words related to artistic and creative activities

    Negative Logits
    Token               Logit
    " Weasley"          -0.60
    " Cinderella"       -0.60
    " winters"          -0.59
    " Cosmos"           -0.59
    " Ghostbusters"     -0.58
    "verb"              -0.57
    " Chao"             -0.56
    "ruary"             -0.56
    "glomer"            -0.56
    " towels"           -0.56
    Positive Logits
    Token               Logit
    "mast"              1.04
    "antage"            0.78
    " mast"             0.76
    "elia"              0.74
    "llah"              0.73
    "obl"               0.72
    "rophe"             0.70
    "rius"              0.69
    "oma"               0.69
    "igible"            0.69
    Activation Density: 0.055%

    No Known Activations