INDEX
    Explanations

    abstract concepts and specific nouns

    Negative Logits
    +\      0.52
    قة      0.46
    зия     0.46
            0.45
    жа      0.44
    UB      0.44
    HR      0.43
    цион    0.43
    ИА      0.43
    按摩    0.43
    Positive Logits
     Encryption  0.53
     custom      0.49
     deciduous   0.49
     solenoid    0.48
     Custom      0.48
     redist      0.48
     Auction     0.48
     Redist      0.47
     surveys     0.47
     stably      0.47
    Act Density 0.001%

    No Known Activations