INDEX
    Explanations
    NEGATIVE LOGITS
    a       0.97
    ari     0.90
            0.87
    z       0.86
    us      0.79
    ai      0.79
    ia      0.78
    are     0.78
    ee      0.73
    i       0.73
    POSITIVE LOGITS
     театра     0.93
    Drama       0.93
     Theatre    0.88
     теат       0.82
     театр      0.82
     Dram       0.80
    🎭          0.80
     नाट्य       0.80
                0.80
     théâtre    0.79
    Act Density 0.027%

    No Known Activations