    Explanations

    references to named entities, particularly those that are associated with food or cultural concepts

    Negative Logits
     Чи
    -0.16
    ourn
    -0.14
    사이
    -0.13
    мм
    -0.13
    แน
    -0.13
    xa
    -0.13
     adlı
    -0.13
    etik
    -0.13
    çģ
    -0.13
    uli
    -0.13
    Positive Logits
     simply
    0.42
     simplement
    0.31
     просто
    0.29
     "
    0.27
    :
    0.24
     kıs
    0.22
    -called
    0.21
     prostě
    0.21
     merely
    0.21
     Simply
    0.21
    Act Density 0.278%

    No Known Activations