    Explanations

    names starting with "Osa" at different activation levels

    occurrences of the substring "osa" within words

    Negative Logits
    "ERAL"       -0.83
    "doms"       -0.80
    "rations"    -0.76
    "sheet"      -0.74
    "rics"       -0.74
    "bler"       -0.74
    "rary"       -0.74
    "rator"      -0.70
    "taking"     -0.69
    "liest"      -0.68

    Positive Logits
    "osa"         1.03
    " Luxem"      1.02
    "qua"         0.94
    "que"         0.94
    "velength"    0.87
    "isy"         0.84
    "hea"         0.84
    "ña"          0.82
    "ques"        0.81
    "uce"         0.80

    Activation Density: 0.016%

    No Known Activations