INDEX
    Explanations

    words with 'Mon' prefix

    Negative Logits
    Token        Value
    'in'         0.64
    'alary'      0.52
    '𝐦'          0.47
    'inį'        0.46
    'andro'      0.45
    'inizi'      0.43
    'atively'    0.42
    'embers'     0.42
    'inati'      0.41
    'علم'        0.41
    Positive Logits
    Token        Value
    ' MON'       0.98
    ' mon'       0.96
    ' Mon'       0.88
    ' mono'      0.77
    ' моно'      0.76
    'Mon'        0.75
    ' monop'     0.75
    ' モン'      0.70
    ' monoc'     0.69
    ' Mono'      0.66
    Activation Density: 0.021%

    No Known Activations