    Explanations

    the presence of the string "ada" in various forms

    Negative Logits
    Token     Logit
    names     -0.73
    icles     -0.71
    giving    -0.69
    sheet     -0.68
    rophic    -0.67
    ãĤĮ       -0.66
    mother    -0.65
    ician     -0.65
    taking    -0.65
    URES      -0.65
    Positive Logits
    Token     Logit
    uthor     0.99
    qua       0.97
    $$        0.92
    ÄŁ        0.90
    illac     0.88
    elta      0.81
    qa        0.81
    BIP       0.80
    q         0.80
    ibur      0.78
    Activation Density: 0.009%

    No Known Activations