INDEX
    Explanations
    NEGATIVE LOGITS
    " Congrats"          -0.13
    " Congratulations"   -0.13
    (token not shown)    -0.12
    " congratulations"   -0.12
    "Congrats"           -0.12
    ",今年"              -0.12
    "今年"               -0.12
    "😂"                 -0.12
    (token not shown)    -0.12
    " newbie"            -0.12
    POSITIVE LOGITS
    " Methods"           0.15
    "Methods"            0.14
    " Functions"         0.14
    "Structures"         0.14
    "Characteristics"    0.14
    " často"             0.14
    "Variants"           0.14
    "Types"              0.14
    "Definitions"        0.14
    " Often"             0.14
    Activation Density 0.187%

    No Known Activations