INDEX
    Explanations

    phrases related to introductions and presenting people or concepts

    Negative Logits
    "ma"     -0.70
    "رى"     -0.66
    "na"     -0.66
    " na"    -0.63
    "ar"     -0.63
    "5"      -0.63
    "m"      -0.63
    "mo"     -0.62
    "pis"    -0.61
    " m"     -0.61
    Positive Logits
    " Introduce"        1.71
    " introduces"       1.61
    " introductions"    1.58
    "Introduce"         1.56
    "introduce"         1.54
    " introduction"     1.53
    " introduce"        1.52
    " introdu"          1.48
    " Introducing"      1.47
    " introducing"      1.47
    Activation Density 0.066%

    No Known Activations