INDEX
    Explanations

    references to treatment methods and their effectiveness

    Negative Logits
     bÃło             -0.17
    ãĥĩãĥ«            -0.15
    yük               -0.14
     arous            -0.14
    caffe             -0.14
    uddle             -0.14
    podob             -0.14
    .newBuilder       -0.14
    Reminder          -0.14
    ipop              -0.13
    Positive Logits
     improvement      0.30
     improved         0.27
     improvements     0.26
     Improvement      0.26
     Improved         0.25
    Improved          0.22
     improve          0.22
     Dram             0.21
     transformation   0.20
     dramatic         0.19
    Activation Density: 0.264%

    No Known Activations