INDEX
    Explanations

    phrases related to medical treatments and their efficacy

    Negative Logits
    Token         Logit
     inoc         -0.14
    .fa           -0.14
    anje          -0.14
     Lawson       -0.13
    _recursive    -0.13
    anse          -0.13
    -animate      -0.13
    ÃŃna          -0.13
    537           -0.13
     "            -0.13
    Positive Logits
    Token         Logit
    æĥ            0.18
    abay          0.17
    ilim          0.16
    ntag          0.16
    -device       0.16
    lernen        0.15
    aylight       0.14
    ,ev           0.14
    /flutter      0.14
     toler        0.14
    Activation Density: 0.027%

    No Known Activations