INDEX
    Explanations

    phrases and structures that express conditions, comparisons, and contrasts

    Negative Logits
    "دانشنامهٔ"  -0.76
    "jména"  -0.60
    "ykite"  -0.60
    " Romains"  -0.59
    " eſt"  -0.59
    "ArrowToggle"  -0.57
    " juſ"  -0.55
    " poveznice"  -0.54
    " Manns"  -0.53
    " universale"  -0.53
    Positive Logits
    " že"  0.94
    " что"  0.90
    " що"  0.87
    " że"  0.86
    " który"  0.86
    " ktorý"  0.82
    " które"  0.82
    " jakie"  0.81
    " שח"  0.81
    " who"  0.80
    Activation Density: 0.073%

    No Known Activations