    Explanations

    expressions of encouragement or gratitude

    Negative Logits
    "bang"          -0.18
    "utenberg"      -0.17
    "_Version"      -0.16
    "انا"           -0.16
    "lood"          -0.15
    "ebi"           -0.15
    "чат"           -0.15
    "ems"           -0.15
    "ugo"           -0.15
    "ags"           -0.14
    Positive Logits
    " vice"         0.15
    " meaningful"   0.14
    "phe"           0.14
    "ascar"         0.14
    " VP"           0.13
    " meaning"      0.13
    "Structured"    0.13
    " دانلود"       0.13
    "足"            0.13
    " Sah"          0.13
    Act Density 0.119%

    No Known Activations