INDEX
    Explanations

    references to George Orwell and his works

    Negative Logits
    phy           -0.16
    unto          -0.15
    оваÑĢи        -0.14
    eres          -0.14
    .navigator    -0.14
    ιά            -0.14
    apol          -0.14
    UTO           -0.13
    NSE           -0.13
     Verg         -0.13
    Positive Logits
    linkplain     0.16
    xit           0.15
    aight         0.15
    ughter        0.14
    znam          0.14
    åı            0.13
     Hanna        0.13
    лиж           0.13
    ptal          0.13
    /stdc         0.13
    Activation Density: 0.033%

    No Known Activations