INDEX
    Explanations

    phrases related to bringing attention or providing additional information

    instances of the word "more" and phrases indicating additional information or continuation

    Negative Logits
    "ij士"        -0.71
    " Peaks"     -0.68
    "ãĤ´ãĥ³"     -0.62
    " Viper"     -0.62
    " bye"       -0.61
    " Penguins"  -0.61
    " catcher"   -0.61
    " Cros"      -0.59
    " majority"  -0.58
    "taboola"    -0.58

    Positive Logits
    " than"      0.97
    "erous"      0.93
    "than"       0.89
    "uries"      0.80
    "worldly"    0.79
    "efficient"  0.77
    "culated"    0.75
    "minecraft"  0.74
    "é¾įå"       0.74
    "artments"   0.71

    Activation Density: 0.178%

    No Known Activations