INDEX
    Explanations

    titles and people's roles

    Negative Logits
     tyranny            0.42
     하나의             0.41
    lavery              0.40
     करणे               0.40
    ักษณะ               0.39
    utorial             0.38
     కొన్ని              0.38
    ledning             0.38
                        0.37
    gebras              0.37

    Positive Logits
     photographer       0.64
     philanthropist     0.58
     astronomer         0.56
     poet               0.55
     physicist          0.54
     professor          0.54
     Professor          0.53
     columnist          0.53
     psychologist       0.53
     filmmaker          0.53

    Act Density 0.038%

    No Known Activations