INDEX
    Explanations

    phrases related to directional movement or transition

    instances of the word "into" and related context

    Negative Logits
    Token          Logit
    " icing"       -0.64
    " due"         -0.64
    " opinion"     -0.62
    " sessions"    -0.61
    " parties"     -0.61
    " DD"          -0.61
    " eval"        -0.60
    " chat"        -0.60
    " breakout"    -0.60
    " Warm"        -0.59
    Positive Logits
    Token          Logit
    "into"          3.70
    "onto"          1.01
    " Into"         1.00
    "inside"        0.97
    "hiba"          0.82
    "lda"           0.82
    "ever"          0.82
    "indu"          0.81
    " INTO"         0.80
    "INT"           0.79
    Activation Density: 0.009%

    No Known Activations