INDEX
    Explanations
    Negative Logits
    Token            Logit
    "PATH"           -0.70
    " iod"           -0.69
    "sburg"          -0.65
    "methyl"         -0.62
    "RAY"            -0.62
    "ansom"          -0.61
    "kus"            -0.60
    "INGS"           -0.58
    "ologically"     -0.58
    "female"         -0.58
    Positive Logits
    Token            Logit
    " cafe"          1.16
    " cafes"         1.14
    " café"          1.05
    " Cafe"          0.99
    " caf"           0.97
    " racer"         0.90
    "eteria"         0.89
    " Café"          0.87
    " kios"          0.86
    "ecake"          0.85
    Activation Density: 0.009%

    No Known Activations