INDEX
    Explanations
    Negative Logits
    Token              Logit
    " flux"            -0.07
    " WITH"            -0.07
    " genius"          -0.06
    " with"            -0.06
    " Kidd"            -0.06
    " svg"             -0.06
    " fetus"           -0.06
    " бізнес"          -0.06
    "نسا"              -0.06
    " condemnation"    -0.06
    Positive Logits
    Token              Logit
    " are"             0.20
    " Are"             0.17
    " were"            0.15
    "Are"              0.15
    " ARE"             0.14
    "are"              0.13
    " Were"            0.13
    "were"             0.12
    "ARE"              0.11
    " aren"            0.11
    Activation Density: 0.380%

    No Known Activations