    NEGATIVE LOGITS
     Levant         -0.88
     confir         -0.65
     coord          -0.62
     stre           -0.61
     pretext        -0.60
     bloodstream    -0.59
     condolences    -0.58
     princ          -0.58
     Palest         -0.58
     Lens           -0.58
    POSITIVE LOGITS
    boa             0.88
    xus             0.87
    uo              0.84
    anu             0.79
    chwitz          0.79
    agy             0.78
    kef             0.78
    agra            0.77
    pg              0.75
    _-              0.74
    Activation Density: 0.041%

    No Known Activations