INDEX
    Negative Logits
    Token      Logit
    " can"     -0.98
    "can"      -0.80
    " is"      -0.75
    " Can"     -0.74
    " the"     -0.73
    " a"       -0.72
    " an"      -0.70
    "Can"      -0.68
    " its"     -0.67
    " has"     -0.66
    Positive Logits
    Token                 Logit
    " فريبيس"             0.87
    " uſe"                0.83
    " Мексичка"           0.82
    " BoxFit"             0.75
    " виправивши"         0.75
    " CreateTagHelper"    0.71
    ".*")]"               0.71
    " ſtill"              0.70
    " ſind"               0.69
    " Roskov"             0.69
    Activation Density: 0.870%

    No Known Activations