INDEX
    Explanations
    NEGATIVE LOGITS
    Token       Value
    "the"       1.03
    "ו"         0.95
    (blank)     0.91
    "m"         0.86
    "ità"       0.85
    "'"         0.84
    "z"         0.80
    "t"         0.76
    "و"         0.75
    "j"         0.73
    POSITIVE LOGITS
    Token       Value
    "Nursing"   0.96
    " Nursing"  0.95
    "Nurse"     0.94
    "К"         0.91
    "бо"        0.88
    " nurses"   0.88
    (blank)     0.88
    "ד"         0.87
    (blank)     0.86
    (blank)     0.86
    Act Density 0.003%

    No Known Activations