INDEX
    NEGATIVE LOGITS
    ,               1.19
    ،               0.98
    (),             0.97
    #,              0.91
    ,“              0.87
    ,*              0.87
    ),              0.85
    As              0.84
                    0.83
    A               0.82
    POSITIVE LOGITS
     but            1.55
     although       1.40
     though         1.39
     लेकिन          1.33
     huh            1.32
     haha           1.32
     albeit         1.28
     yeah           1.26
     unless         1.24
     preferably     1.24
    Activation Density: 0.699%

    No Known Activations