INDEX
    Explanations

    mentions of the city Tokyo

    Negative Logits
    hemy         -0.78
    inelli       -0.74
    estern       -0.73
    ebook        -0.70
    edly         -0.69
    theless      -0.68
    mble         -0.68
    onies        -0.68
    ibilities    -0.68
    rals         -0.67
    Positive Logits
     Babel       0.81
     Dome        0.80
     Lumpur      0.77
     Gh          0.74
     Xan         0.71
    ichi         0.70
    Tok          0.68
     Bay         0.67
    iji          0.66
    jin          0.65
    Act Density 0.011%

    No Known Activations