INDEX
    Explanations

    the substring "ab" followed by a single-digit activation value

    occurrences of the abbreviation "ab" in the text

    Negative Logits
    Token         Logit
    " virtue"     -0.80
    " nomine"     -0.74
    " Perse"      -0.69
    " Coco"       -0.65
    "bilt"        -0.65
    " Celest"     -0.63
    "backer"      -0.61
    " Uriel"      -0.60
    " Izan"       -0.60
    " Patriot"    -0.59
    Positive Logits
    Token         Logit
    "stract"      1.34
    "bing"        1.11
    "yrinth"      1.10
    "urger"       1.09
    "raham"       1.04
    "road"        1.03
    "dullah"      1.03
    "bed"         1.00
    "ecause"      0.98
    "riel"        0.96
    Activation Density: 0.038%

    No Known Activations