INDEX
    Explanations

    occurrences of the word "new" in different contexts

    Negative Logits
    Token         Logit
    "anho"        -0.59
    "ைகள்"         -0.56
    "deburg"      -0.51
    " Andi"       -0.50
    "úgó"         -0.49
    "dings"       -0.48
    "ensa"        -0.48
    "cious"       -0.48
    " surely"     -0.47
    "úsqueda"     -0.47
    Positive Logits
    Token             Logit
    "new"             1.53
    " new"            1.05
    "Hentet"          0.90
    " للاسماء"         0.81
    " neuen"          0.78
    "NEW"             0.76
    " nieuwe"         0.74
    " новой"          0.73
    "neue"            0.73
    "GraphicsUnit"    0.73
    Activation Density: 0.074%

    No Known Activations