INDEX
    Explanations
    NEGATIVE LOGITS
    Token           Logit
    "erator"        -0.18
    "trys"          -0.17
    "sport"         -0.17
    "estatus"       -0.17
    "aries"         -0.17
    "orners"        -0.16
    "experimental"  -0.16
    " sport"        -0.16
    "tf"            -0.15
    "eus"           -0.15
    POSITIVE LOGITS
    Token           Logit
    "manship"       0.40
    "men"           0.32
    "man"           0.31
    "mans"          0.30
    " Illustrated"  0.29
    "people"        0.28
    "books"         0.28
    "medicine"      0.28
    "book"          0.27
    "heets"         0.27
    Act Density 0.021%

    No Known Activations