INDEX
    Explanations

    expressions related to determination and resilience

    Negative Logits
    iros            -0.15
    INE             -0.15
    ovah            -0.14
    ighbours        -0.14
    ua              -0.14
    hs              -0.14
    _initializer    -0.14
    hb              -0.14
    vez             -0.14
    ritten          -0.14
    Positive Logits
     ｜              0.16
    aho             0.15
     Mi             0.15
     female         0.15
     Female         0.14
    ообраз          0.14
    okoj            0.14
    ッ               0.14
    Mi              0.14
     ops            0.14
    Act Density 0.006%

    No Known Activations