INDEX
    Explanations

    references to reality television, specifically the "Real Housewives" franchise

    Negative Logits
    Token     Logit
    subs      -0.17
    rale      -0.16
    åĽ£       -0.16
     Downs    -0.15
    esy       -0.14
    hed       -0.14
    iban      -0.14
    onder     -0.13
    UTERS     -0.13
    lu        -0.13
    Positive Logits
    Token     Logit
    731       0.16
    dzi       0.15
    -REAL     0.15
    INF       0.14
    ¬         0.14
    ynı       0.14
    725       0.14
    ìľ¤       0.14
    Ù쨹       0.14
    _WATCH    0.14
    Act Density 0.012%

    No Known Activations