    Explanations

    references to specific instances or occurrences

    Negative Logits
    " on"           -0.17
    "egral"         -0.15
    "iê"            -0.14
    "ildi"          -0.14
    "èħ"            -0.14
    "å¼ı"           -0.14
    "forcements"    -0.13
    "onaut"         -0.13
    "äge"           -0.13
    "cak"           -0.13

    Positive Logits
    " behalf"        0.51
    " occasions"     0.39
    " occasion"      0.39
    " basis"         0.36
    "basis"          0.32
    "occasion"       0.31
    " grounds"       0.28
    "_basis"         0.24
    " Basis"         0.23
    " dime"          0.21

    Act Density 0.832%

    No Known Activations