INDEX
    Explanations

    understanding, empathy, and support

    Negative Logits
    N       0.57
    kval    0.55
    í       0.52
    ene     0.51
    h       0.51
    io      0.50
    ı       0.50
    ه‌ای    0.48
    های     0.48
    小于    0.48
    Positive Logits
    ר           0.55
     amulet     0.54
     coexist    0.53
     place      0.52
     divine     0.51
     carotid    0.49
     apathy     0.48
     Mén        0.46
     depot      0.45
     curtail    0.45
    Act Density 0.083%

    No Known Activations