INDEX
    Explanations

    rotation, rotate, rotated

    Negative Logits
    са       1.51
    ри       1.41
    ро       1.36
    то       1.24
    ř        1.24
    се       1.23
    ד        1.22
    е        1.21
    ጨማሪ     1.20
    ן        1.20
    Positive Logits
    s          1.88
    ્સ         1.77
    不已       1.70
    i          1.65
    ه          1.55
    𝘴          1.52
    dır        1.46
    الى        1.46
    citizens   1.46
    ścic       1.46
    Activation Density: 0.166%

    No Known Activations