INDEX
    Explanations

    phrases that list categories or options

    phrases indicating the classification or categorization of concepts and reasons

    Negative Logits
    ergy           -0.74
    Beast          -0.72
    Ire            -0.71
    bats           -0.71
     Andromeda     -0.68
    uddin          -0.66
    ebook          -0.64
    Downloadha     -0.62
    istant         -0.62
    lator          -0.59
    Positive Logits
     viz            1.08
    %:              1.02
    :-              1.00
    ):              0.85
     simultaneously 0.85
     namely         0.83
    :               0.82
    :(              0.80
    :#              0.78
    :"              0.75
    Act Density 0.132%

    No Known Activations