    Explanations

    references to thresholds and related measurements

    Negative Logits
    Token            Logit
    " μην"           -0.18
    " thorough"      -0.17
    "iform"          -0.16
    " Jarvis"        -0.15
    (whitespace)     -0.15
    "pend"           -0.15
    "tures"          -0.15
    "ivity"          -0.15
    "ween"           -0.15
    "tings"          -0.15
    Positive Logits
    Token            Logit
    ".Tasks"          0.25
    "ursday"          0.20
    " Nhĩ"            0.19
    "apeutic"         0.19
    "reesome"         0.17
    "ompson"          0.17
    "bolt"            0.17
    "istle"           0.16
    "sgiving"         0.16
    "puts"            0.16
    Activation density: 0.184%

    No Known Activations