    NEGATIVE LOGITS
    "j"                0.62
    "ба"               0.61
    "f"                0.59
    "ш"                0.57
    "NW"               0.55
    "𝓬"                0.55
    "но"               0.55
    " که"              0.54
    "v"                0.53
    "ای"               0.52

    POSITIVE LOGITS
    "ون"               0.61
    " installations"   0.60
    " construction"    0.59
    " constructions"   0.58
    " architecture"    0.53
    "'"                0.52
    " बांधकाम"          0.50
    " invigorating"    0.49
    "ude"              0.49
    " for"             0.49
    Act Density: 0.078%

    No Known Activations