    Negative Logits
    Token           Logit
    "Ore"           -0.08
    " Kum"          -0.08
                    -0.08
    " Ae"           -0.08
    "Aj"            -0.07
    " ac"           -0.07
    " distant"      -0.07
    " Polk"         -0.07
    "AJ"            -0.07
    " Ore"          -0.07

    Positive Logits
    Token           Logit
    " Elsa"         0.09
                    0.08
    " lover"        0.08
    " balt"         0.08
                    0.07
    " wadd"         0.07
    " fiscal"       0.07
    "237"           0.07
    " Kath"         0.07
    " Luxembourg"   0.07

    Activation Density: 0.002%

    No Known Activations