    Negative Logits
    "idge"            -0.08
    " фундамент"      -0.07
    "_eg"             -0.07
    (blank)           -0.07
    "ียญ"             -0.07
    " Svens"          -0.07
    " Nielsen"        -0.06
    " lavoro"         -0.06
    " podmínky"       -0.06
    (blank)           -0.06

    Positive Logits
    " QU"             0.07
    "inant"           0.06
    "xca"             0.06
    " Preparation"    0.06
    " interfering"    0.06
    ".Photo"          0.06
    " Harmon"         0.06
    " phased"         0.06
    "-phone"          0.06
    " halluc"         0.06
    Activation Density: 0.012%

    No Known Activations