INDEX
    Explanations

    Domain-specific technical or professional terminology, especially named identifiers and camelCase or API-like tokens in code or formal contexts.

    Negative Logits
    Token            Value
    "combined"       0.42
    " chciał"        0.41
    "own"            0.40
    " friend"        0.39
    "ridine"         0.39
    " स्वाभाविक"      0.38
    "friend"         0.38
    " insulated"     0.38
    " arkadaş"       0.38
    "想到"           0.38
    Positive Logits
    Token            Value
    "ലൈ"             0.43
    "违法"           0.41
    " compuls"       0.40
    "ர்களை"          0.40
    "seur"           0.38
    " conscient"     0.37
    "ទាំងអស់"        0.37
    "ostatistics"    0.37
    " monog"         0.36
    " conclusiones"  0.36
    Activation Density: 0.035%

    No Known Activations