INDEX
    Explanations

    references to popular films and television series

    Negative Logits
     Ney          -0.16
    ä½            -0.15
     Hunger       -0.15
    arella        -0.15
    zew           -0.14
     Sao          -0.14
    å¸Ń           -0.14
    ickle         -0.14
    cka           -0.14
     Economist    -0.13
    Positive Logits
    atel          0.16
    976           0.15
     pol          0.15
    ãĥ¼ãĥĵ        0.14
     terminal     0.14
    лиз           0.14
     McCabe       0.14
    iek           0.14
    ibs           0.13
     Franken      0.13
    Activation Density: 0.076%

    No Known Activations