INDEX
    Explanations

    statements indicating metrics or measurements of significance

    Negative Logits
    " Drawn"       -0.83
    " vulner"      -0.76
    " Antar"       -0.74
    " Mobil"       -0.68
    " sacrific"    -0.66
    " Belfast"     -0.66
    " Gardens"     -0.64
    " Agric"       -0.63
    " retirees"    -0.61
    "iffs"         -0.61
    Positive Logits
    "ï¸ı"     1.01
    "same"    0.95
    "fter"    0.91
    "href"    0.90
    "ski"     0.90
    "Pg"      0.86
    "shall"   0.86
    "felt"    0.85
    "mir"     0.82
    "own"     0.80
    Activation Density: 0.052%

    No Known Activations