INDEX
    Explanations

    phrases that emphasize significance and highlight positive attributes or features

    Negative Logits
    Token                 Logit
    "fvar"                -0.58
    "SPATH"               -0.56
    " Obrador"            -0.55
    "ItemBackground"      -0.54
    "تقاوى"               -0.53
    "WireFormatLite"      -0.53
    " esterni"            -0.51
    "gamot"               -0.50
    " виправивши"         -0.49
    ":✨"                  -0.49
    Positive Logits
    Token                 Logit
    " soprattutto"        0.69
    " surtout"            0.68
    "何より"               0.68
    "ScopeManager"        0.62
    "Ultimately"          0.61
    " vooral"             0.60
    "AxisAlignment"       0.59
    " nhất"               0.56
    "twimg"               0.55
    " Ultimately"         0.55
    Activation Density: 0.281%

    No Known Activations