INDEX
    Explanations

    references to celebrities

    Negative Logits
    Token       Logit
    "Blocks"    -0.75
    " Flow"     -0.72
    "sis"       -0.68
    "nav"       -0.68
    "cture"     -0.67
    " Mines"    -0.67
    "ieves"     -0.66
    "flow"      -0.66
    "odes"      -0.65
    "mol"       -0.63
    Positive Logits
    Token             Logit
    " celebrity"      3.39
    " celebrities"    2.63
    " celeb"          2.49
    " Celebrity"      2.29
    " cele"           1.95
    " Celeb"          1.93
    " fame"           1.57
    "cele"            1.55
    "Cele"            1.51
    " superstar"      1.46
    Activation Density: 0.014%

    No Known Activations