    Negative Logits
    Token               Logit
    "<bos>"             -0.60
    " Dodson"           -0.56
    "CloseOperation"    -0.49
    " 확인함"            -0.47
    "béco"              -0.46
    " ERR"              -0.46
    "öt"                -0.46
    "indd"              -0.45
    "ök"                -0.45
    " Efq"              -0.45
    Positive Logits
    Token               Logit
    " Media"            2.27
    "Media"             2.19
    " MEDIA"            1.63
    "MEDIA"             1.52
    " Medien"           1.28
    "Medien"            1.22
    " medi"             1.20
    " media"            1.18
    "medi"              1.14
    "media"             1.04
    Activation Density: 0.005%

    No Known Activations