INDEX
    Explanations

    phrases that indicate drawing inspiration or connections from various sources

    Negative Logits
    "ovah"      -0.18
    "ature"     -0.17
    "udas"      -0.16
    "iment"     -0.15
    "aya"       -0.15
    "oy"        -0.15
    "amas"      -0.14
    "oyal"      -0.14
    "载"        -0.14
    "iments"    -0.14
    Positive Logits
    " draw"         0.31
    " draws"        0.30
    " drawn"        0.30
    " attention"    0.30
    " Draw"         0.28
    " Draws"        0.28
    ".Draw"         0.27
    "attention"     0.27
    ".draw"         0.26
    " drew"         0.26
    Activation Density: 0.021%

    No Known Activations