INDEX
    Explanations
    NEGATIVE LOGITS
    "compare"       -0.07
    " why"          -0.07
    "ACHINE"        -0.07
    "概念"          -0.07
    "):"            -0.06
    " спир"         -0.06
    " backstory"    -0.06
    " reasons"      -0.06
    "(clone"        -0.06
    "(filepath"     -0.06
    POSITIVE LOGITS
    " quelque"      0.07
    "hq"            0.07
    "alarda"        0.07
    "anyahu"        0.07
    "Uno"           0.07
    "nj"            0.06
    " Netanyahu"    0.06
    ".job"          0.06
    "Decre"         0.06
    " AK"           0.06
    Activation Density: 0.021%

    No Known Activations