INDEX
    Explanations

    references to popular culture and media

    Negative Logits
    amble      -0.16
     *----------------------------------------------------------------  -0.15
    kop        -0.15
    usal       -0.14
    heck       -0.14
    quer       -0.14
    allen      -0.13
    ारण        -0.13
    byt        -0.13
    nesty      -0.13
    Positive Logits
    uzey       0.14
     doby      0.14
    лава       0.13
     дер       0.13
     Dell      0.13
     hete      0.13
    .chunk     0.13
     desar     0.13
    enaire     0.13
     Hend      0.13
    Act Density 0.100%

    No Known Activations