    Explanations

    specific numerical information, such as quantities or rankings

    Negative Logits
    Token       Logit
    utics       -0.82
    each        -0.72
    strength    -0.71
    their       -0.71
    allah       -0.68
    rey         -0.68
    lag         -0.68
    terness     -0.68
    erved       -0.68
    apt         -0.67
    Positive Logits
    Token          Logit
     casualty      1.17
     thing         1.11
     reason        0.94
     beneficiary   0.93
     major         0.91
     obstacle      0.88
     culprit       0.87
     installment   0.86
     piece         0.86
     exception     0.85
    Activation Density: 1.390%

    No Known Activations