INDEX
    Explanations

    сто, столе, student

    Negative Logits
    ۰       1.34
    mi      1.28
    the     1.28
    ri      1.13
    self    1.09
    aría    1.02
    ar      1.02
    ni      1.02
    raded   1.02
    sc      1.01
    Positive Logits
     on     1.66
    ت       1.52
    С       1.48
    ↵↵      1.43
    (token not shown)  1.43
    ك       1.42
    "       1.33
    У       1.30
    ات      1.23
    కు      1.23
    Act Density 0.000%

    No Known Activations