INDEX
    Explanations

    words related to images or symbols

    Negative Logits
    itel        -0.18
    i           -0.18
    erate       -0.17
    iou         -0.16
    e           -0.16
    al          -0.15
    aliz        -0.15
    er          -0.15
    QA          -0.14
    appen       -0.14
    Positive Logits
    othy        0.25
    ergic       0.17
     çī         0.17
    pressions   0.16
    jal         0.16
    iliki       0.16
    PLE         0.16
    ENSION      0.15
    rod         0.15
    yasal       0.15
    Activation Density: 0.084%

    No Known Activations