INDEX
    Explanations

    percentages and numerical data within text

    NEGATIVE LOGITS
    aub         -0.17
    loor        -0.16
    Ø¡          -0.15
    ibi         -0.15
     Sab        -0.14
    achu        -0.14
    zan         -0.14
    raq         -0.14
     Birch      -0.14
     Bryant     -0.14
    POSITIVE LOGITS
     agre       0.16
    ?('         0.15
    ó           0.15
    èĵ          0.14
    pekt        0.14
    .reporting  0.14
     Preis      0.14
     '",        0.14
    Occurred    0.14
    ç           0.13
    Activation Density: 0.004%

    No Known Activations