    Explanations

    relationships and interactions between different individuals

    Negative Logits
    Token         Logit
    " mef"        -1.45
    " dises"      -1.41
    " haup"       -1.33
    " umo"        -1.33
    " gonz"       -1.33
    " seiz"       -1.33
    "fordable"    -1.31
    " canel"      -1.30
    " hcm"        -1.28
    " abnorm"     -1.28

    Positive Logits
    Token         Logit
    " He"          0.91
    " She"         0.83
    " They"        0.82
    " Thus"        0.78
    " His"         0.77
    " So"          0.76
    " Therefore"   0.75
    <eos>          0.75
    "↵↵"           0.74
    " But"         0.74
    Activation Density: 0.500%

    No Known Activations