    Explanations

    characters with a special symbol before and after their name

    instances of the character "Ŀ" in the text

    Negative Logits
     disadvant      -0.84
     warr           -0.83
     ende           -0.78
     psychiat       -0.72
     incorpor       -0.72
     secretaries    -0.71
     perspect       -0.70
     answ           -0.70
     unemploy       -0.69
     misunder       -0.69
    Positive Logits
    ï¸ı             1.02
    °               0.93
    é¾į             0.86
    âĻ              0.84
    ÃĽ              0.83
    º               0.81
    âĶĢâĶĢ          0.81
     âĢº            0.77
    âĻ¥             0.76
    ï¸              0.74
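
The positive-logit tokens above look like raw vocabulary strings rather than readable text. A minimal sketch for reading them, assuming (the dashboard does not say so) that they come from a GPT-2-style byte-level BPE vocabulary, where each byte is shown as one printable character. The helper names `token_to_bytes` and `describe` are hypothetical and exist only for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable character table, in the style of GPT-2's byte-level BPE.

    ASSUMPTION: the dashboard's tokens use this convention; the source does not confirm it.
    """
    # Bytes that are already printable keep their own code point.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(0xA1, 0xAC + 1))
          + list(range(0xAE, 0xFF + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Map a displayed token string back to its raw bytes (hypothetical helper)."""
    out = bytearray()
    for ch in token:
        if ch == " ":
            # The dashboard shows a leading space literally; treat it as byte 0x20.
            out.append(0x20)
        else:
            out.append(CHAR_TO_BYTE[ch])
    return bytes(out)


def describe(token: str) -> str:
    """Return the decoded text, or the raw bytes when they are not complete UTF-8."""
    raw = token_to_bytes(token)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return "<incomplete UTF-8: " + raw.hex(" ") + ">"


if __name__ == "__main__":
    for tok in ["âĻ¥", "é¾į", "âĶĢâĶĢ", " âĢº", "°", "ï¸"]:
        print(repr(tok), "->", describe(tok))
```

Under this assumption, several entries are whole symbols (for example a heart and a CJK character), while others such as "°" or "ï¸" are fragments of a multi-byte UTF-8 sequence and cannot be decoded on their own. The "Ŀ" mentioned in the explanation would likewise stand for a single raw byte (0x9d) rather than a literal letter.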
    Act Density 0.123%

    No Known Activations