INDEX
    Explanations

    mentions of the word "French" and related references to France

    Negative Logits
    Token         Logit
    jit           -0.17
    think         -0.15
    jte           -0.15
    nbsp          -0.15
    rel           -0.15
    ded           -0.15
    oted          -0.15
    reu           -0.14
    ationToken    -0.14
    readcr        -0.14
    Positive Logits
    Token         Logit
    -speaking     0.21
    man           0.17
    ostel         0.15
    ysz           0.15
    esy           0.15
    making        0.14
    phone         0.14
    IRO           0.14
    ake           0.14
    men           0.14
    Activation Density: 0.091%
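For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample counts are all hypothetical, and the counts are made up purely so the result matches the reported 0.091%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with a nonzero activation.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 91 of which activate the feature.
# These counts are illustrative only, not taken from the page.
sample = [0.0] * 99_909 + [1.5] * 91
density = activation_density(sample)
print(f"Activation density: {density * 100:.3f}%")
```

Under this reading, a density this low means the feature fires on fewer than one token in a thousand, which is consistent with a feature tied to a narrow topic.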

    No Known Activations