INDEX
    Explanations

    age followed by person descriptor

    Negative Logits
    Token       Logit
    ה           1.10
    ه           0.99
    a           0.93
    נן          0.82
    u           0.81
                0.80
    ール          0.79
    תן          0.79
     sumptuous  0.77
    aing        0.77
    Positive Logits
    Token       Logit
    ش           1.11
    "           1.07
    ED          0.92
    ri          0.86
    ik          0.86
    ut          0.84
    är          0.84
     be         0.83
    ni          0.82
    res         0.81
    Act Density 0.006%

    No Known Activations