INDEX
    Negative Logits
    Token            Logit
    " Worm"          -0.07
    "Plane"          -0.06
    " března"        -0.06
    " Julius"        -0.06
    " ein"           -0.06
    " fishermen"     -0.06
    " rejected"      -0.06
    " Newly"         -0.06
    " біля"          -0.06
    "Indian"         -0.06

    Positive Logits
    Token            Logit
    "59"             0.07
    "macros"         0.06
    " compos"        0.06
    "prix"           0.06
    " komp"          0.06
    " exped"         0.06
    "_DIR"           0.06
    " يا"            0.06
    "charged"        0.06
    " rims"          0.06

    Activation Density: 0.012%

    No Known Activations