INDEX
    Explanations

    references to quantifiable metrics and data in various contexts

    Negative Logits
     ainfi            -0.87
    PreferredItem     -0.80
     مشين             -0.80
     myſelf           -0.78
    TintMode          -0.77
     Dili             -0.76
     Diretto          -0.75
    Puis              -0.74
     therefrom        -0.74
     (\<              -0.74
    Positive Logits
                      0.76
     has              0.61
     or               0.59
     had              0.55
     A                0.53
     have             0.52
     Has              0.52
     and              0.50
     punya            0.49
     very             0.48
    Act Density 0.707%

    No Known Activations