INDEX
    Explanations

    phrases expressing similarity or comparison

    Negative Logits
    Token            Logit
    "urus"           -0.17
    "PLEX"           -0.16
    "istrovství"     -0.14
    "mic"            -0.14
    " Schro"         -0.14
    " fro"           -0.14
    " Merchant"      -0.13
    "iscard"         -0.13
    "scribe"         -0.13
    " merchant"      -0.13
    Positive Logits
    Token            Logit
    "llx"            0.15
    ".variables"     0.15
    "Outlined"       0.15
    " heraus"        0.14
    "antry"          0.14
    " Giul"          0.14
    " Kens"          0.14
    ".nlm"           0.13
    "rollo"          0.13
    "idf"            0.13
    Activation Density: 0.014%

    No Known Activations