    NEGATIVE LOGITS
    alsh        -0.77
    etheless    -0.77
     Kear       -0.73
    ategory     -0.69
     Goods      -0.67
     Inqu       -0.67
    aples       -0.64
     Dhabi      -0.64
    agons       -0.64
    elsius      -0.63
    POSITIVE LOGITS
    writer      1.41
    writers     1.36
     lyrics     1.35
    writing     1.32
    stress      1.20
     lyric      1.16
    bird        1.14
     song       1.08
     songs      1.04
    birds       1.04
    Activation Density: 0.015%

    No Known Activations