INDEX
    Explanations

    phrases indicating openness or accessibility

    Negative Logits
    "tram"     -0.15
    "ering"    -0.15
    "šak"      -0.15
    "741"      -0.15
    "ÑĢак"     -0.14
    "DMI"      -0.14
    "usch"     -0.14
    "cial"     -0.14
    "ä¹³"      -0.14
    "ık"       -0.13
    Positive Logits
    "phins"    0.15
    "ì²ľ"      0.14
    "hart"     0.14
    "ognito"   0.14
    " culo"    0.14
    " Lis"     0.13
    "æľĿ"      0.13
    "æķ"       0.13
    "elps"     0.13
    "Å"        0.13
    Activation Density: 0.017%

    No Known Activations