INDEX
    Explanations

    Multiple languages

    Negative Logits
     teaches  -0.09
    なが  -0.08
     jauh  -0.08
     slightest  -0.08
    ’র  -0.08
    -worth  -0.08
     loin  -0.08
     equips  -0.07
    한다  -0.07
    Tip  -0.07

    Positive Logits
     directly  0.09
     सीधे  0.09
     sogenannte  0.08
     Amish  0.08
    .flip  0.07
     Farmer  0.07
     Jehovah  0.07
    vip  0.07
     directamente  0.07
     Vulkan  0.07
    Activation Density: 0.180%

    No Known Activations