INDEX
    NEGATIVE LOGITS
     D              1.14
     L              1.08
     At             1.06
     E              1.06
     N              1.05
     I              1.05
     И              1.03
     C              1.03
     Y              1.01
     Do             1.00
    POSITIVE LOGITS
    the             1.29
    sthe            1.21
    mamm            1.15
    vocabulary      1.15
    television      1.14
    cultural        1.12
    citation        1.11
    scientist       1.10
    presentation    1.10
    customer        1.09
    Activation Density 0.000%

    No Known Activations