INDEX
    Explanations

    references to types of wine

    Negative Logits
    loit      -0.16
    ãng       -0.15
    elu       -0.15
    iÄĻ       -0.14
     national -0.14
    oten      -0.14
    .zh       -0.14
    lobs      -0.14
    iges      -0.14
     censor   -0.14
    Positive Logits
     Gone     0.17
     Alive    0.15
    aroo      0.15
    éĦī       0.14
    eras      0.14
    opak      0.14
     spaced   0.14
    rip       0.13
    ìŀ¡       0.13
    ripp      0.13
    Activation Density: 0.003%

    No Known Activations