    Explanations

    the word "whole" followed by a positive adjective

    Negative Logits
    intent              -0.90
    ++++++++++++++++    -0.88
     Feinstein          -0.88
    endants             -0.87
     inputs             -0.85
     KE                 -0.85
    yip                 -0.85
    inator              -0.83
    Downloadha          -0.82
    rf                  -0.82
    Positive Logits
    heartedly           2.10
    hearted             1.51
    meal                1.23
     Foods              1.13
    allo                1.11
    whe                 1.07
    grown               1.02
    エル                0.99
     swat               0.99
    osaurus             0.98
    Activation Density: 0.383%

    No Known Activations