INDEX
    NEGATIVE LOGITS
     made         0.55
     naïve        0.48
     discontent   0.48
     tying        0.48
     logarithm    0.47
     potion       0.47
     He           0.47
     brute        0.46
     autosomal    0.44
     be           0.44
    POSITIVE LOGITS
    /     1.88
    /(    1.43
    -/    1.40
    /_    1.38
    /{    1.37
    /[    1.37
          1.35
    /%    1.35
    /\    1.34
    /)    1.34
    Activation Density: 0.787%

    No Known Activations