    Explanations

    phrases indicating a sequence of events

    Negative Logits
    Token               Logit
    " stakes"           -0.29
    "ility"             -0.29
    " welf"             -0.28
    "Winged"            -0.28
    " contradiction"    -0.28
    " encouragement"    -0.28
    "fighter"           -0.27
    "ciplinary"         -0.27
    " Peninsula"        -0.27
    "borgh"             -0.27

    Positive Logits
    Token               Logit
    "Ń·"                0.44
    "abouts"            0.41
    "ettings"           0.39
    "EStream"           0.39
    "atra"              0.37
    "utm"               0.36
    "orm"               0.36
    "orthy"             0.36
    "nih"               0.36
    "daq"               0.34
    Activation Density: 11.465%

    No Known Activations