INDEX
    Explanations
    Negative Logits
    Token         Logit
    "Happy"       -0.07
    " orders"     -0.07
    "yp"          -0.06
    "287"         -0.06
    " up"         -0.06
    " yield"      -0.06
    " eb"         -0.06
    "'l"          -0.06
    " Pilot"      -0.06
    "352"         -0.06

    Positive Logits
    Token         Logit
    " France"     0.33
    "France"      0.23
    " france"     0.17
    "rance"       0.12
    "ance"        0.08
    " Spain"      0.08
    " Frances"    0.08
    " Hollande"   0.07
    " Фран"       0.07
    "랑스"        0.07
    Act Density 0.006%

    No Known Activations