INDEX
    Negative Logits
    Token              Logit
    "k"                0.44
    " vang"            0.43
    "资源的"            0.42
    " ضرور"            0.41
    " appell"          0.41
    " properties"      0.40
    " propriedades"    0.40
    "n"                0.40
    " risorse"         0.40
    " propriétés"      0.39
    Positive Logits
    Token              Logit
    " Букови"          0.50
    "Behaviours"       0.49
    "Dg"               0.47
                       0.46
    "Tokens"           0.46
    "𝐃"                0.46
    "ţi"               0.46
    "точ"              0.46
    "Periph"           0.46
    "Buk"              0.46
    Activation Density: 0.001%

    No Known Activations