    Negative Logits
    Token          Logit
    "cence"        -0.07
    "winter"       -0.06
    " sight"       -0.06
    " lumin"       -0.06
    "\tcounter"    -0.06
    " numb"        -0.06
    "407"          -0.06
    "Intro"        -0.06
    "toContain"    -0.06
    " Nir"         -0.06
    Positive Logits
    Token          Logit
    " deal"        0.12
    "Deal"         0.12
    " Deal"        0.11
    " deals"       0.09
    "deal"         0.09
    " DEAL"        0.08
    " Deals"       0.08
    ".alloc"       0.07
    "bol"          0.07
    "ила"          0.07
    Act Density 0.014%

    No Known Activations