INDEX
    Explanations

    phrases related to causality or consequence

    occurrences of the word "in" and its different contexts

    Negative Logits
    Token             Logit
    "resa"            -0.86
    "peria"           -0.84
    "roup"            -0.77
    "arter"           -0.75
    "hyde"            -0.74
    "arty"            -0.70
    "rying"           -0.69
    "zing"            -0.68
    "aley"            -0.67
    " Ending"         -0.65

    Positive Logits
    Token             Logit
    "pires"            0.85
    " turns"           0.84
    " turned"          0.78
    " translates"      0.76
    " incidentally"    0.75
    " happens"         0.74
    " frankly"         0.71
    " resembles"       0.68
    " admittedly"      0.68
    " inexpl"          0.66

    Activation Density: 0.104%

    No Known Activations