    Explanations

    mentions of the city Paris in various contexts

    Negative Logits
    tir      -0.19
    tor      -0.17
    halt     -0.16
    yun      -0.16
    amber    -0.16
    eer      -0.16
    eous     -0.16
    dong     -0.15
    tings    -0.15
    tors     -0.15
    Positive Logits
    ian      0.31
    ians     0.23
    IAN      0.20
     Hilton  0.20
    ienne    0.20
    cope     0.19
    ة        0.17
    itic     0.17
    ien      0.16
    ney      0.16
    Activation Density: 0.008%

    No Known Activations