INDEX
    Explanations

    noun categories and descriptors

    Negative Logits
    𝗶  0.44
    -">  0.40
    𝗬  0.40
    𝙄  0.37
     disqualify  0.37
    𝗗  0.37
    이를  0.36
    𝗧  0.36
    BOOL  0.35
    িগুণ  0.35
    Positive Logits
     extraordinaire  0.42
    ாம்  0.39
    celona  0.37
    ρεία  0.35
     simbolo  0.34
     llamadas  0.34
    0.33
     Ern  0.32
    chanics  0.32
    ское  0.32
    Act Density 0.213%

    No Known Activations