INDEX
    Explanations

    configurations, features, roles

    Negative Logits
    вшей  -0.68
    ابعة  -0.64
     劉  -0.64
    hime  -0.64
     entirely  -0.63
     tenté  -0.63
    owners  -0.63
    อยู่ใน  -0.63
     Merrill  -0.62
    ImageIO  -0.62
    Positive Logits
    rof  0.76
     beagle  0.70
     gypsum  0.68
    agara  0.67
    Storia  0.66
     апте  0.66
     Replica  0.65
    Yor  0.65
    わけではない  0.64
     carts  0.64
    Activation Density: 0.054%

    No Known Activations