INDEX
    Explanations

    references to reality television shows and their elements

    Negative Logits
    atown    -0.16
    avern    -0.15
    ouro    -0.15
    rawer    -0.15
    úa    -0.15
    ivable    -0.15
    mares    -0.15
    atism    -0.15
    оÑĢаÑı    -0.14
     поба    -0.14
    Positive Logits
     daily    0.17
    693    0.17
     contestants    0.16
    asına    0.16
    villa    0.15
    692    0.15
     contestant    0.14
     Mon    0.14
    ç̬    0.14
    hooks    0.14
    Act Density 0.002%

    No Known Activations