INDEX
    Explanations

    describing female relatives

    Negative Logits
    .         0.86
    ra        0.61
    ل         0.60
    sekut     0.55
    tty       0.55
    ла        0.54
    them      0.54
    ter       0.54
    ta        0.53
    an        0.52
    Positive Logits
                    0.64
    <0x0D>          0.64
     denounced      0.61
     perpetually    0.60
     stunned        0.57
     harrowing      0.56
     denounce       0.55
    க்கு            0.54
     disgusted      0.54
     беременности   0.53
    Act Density 0.009%

    No Known Activations