INDEX
    Explanations

    occurrences of the word "Moscow" or variations of it

    Negative Logits
    "wares"     -0.15
    "ace"       -0.15
    " Cyc"      -0.15
    " Krish"    -0.15
    "able"      -0.14
    "tin"       -0.14
    "isable"    -0.14
    "ables"     -0.13
    "ertext"    -0.13
    "Mocks"     -0.13
    Positive Logits
    "quito"     0.29
    "ambique"   0.23
    "aic"       0.19
    "cow"       0.19
    "lems"      0.18
    "phere"     0.17
    "ëŀĺ"       0.17
    "ADDE"      0.17
    "quit"      0.17
    "elm"       0.17
    Activation Density: 0.014%

    No Known Activations