INDEX
    Explanations

    instances of the word "stopped"

    Negative Logits
    "龍喚士"         -0.74
    "aths"           -0.70
    " Coliseum"      -0.68
    "ighth"          -0.68
    "arov"           -0.67
    "Sov"            -0.66
    "arden"          -0.65
    "eer"            -0.64
    "rocket"         -0.62
    "adier"          -0.60

    Positive Logits
    " bothering"      1.03
    " abruptly"       0.94
    "watch"           0.92
    "gap"             0.84
    " breathing"      0.83
    "watching"        0.81
    " raining"        0.78
    " blinking"       0.77
    " worrying"       0.76
    " laughing"       0.75
    Activation Density 0.025%

    No Known Activations