    Explanations
    Negative Logits
     xúc            -0.08
    ivel            -0.08
    ęs              -0.07
     любви          -0.07
    ersistent       -0.07
    :a              -0.07
    াদক             -0.07
     kehidupan      -0.07
    .Xr             -0.07
    幸福            -0.07
    Positive Logits
     einzelne       0.15
     individuele    0.13
     individual     0.13
     individuales   0.12
    individual      0.12
     einzel         0.12
     individually   0.12
     Individual     0.11
     отдельных      0.11
     pojed          0.11
    Activation Density: 0.065%

    No Known Activations