INDEX
    Explanations
    Negative Logits
    Logit  Token (quoted; a leading space is part of the token)
    -0.73  " متعلقه"
    -0.66  " חיצוניים"
    -0.65  "herself"
    -0.64  "Personendaten"
    -0.62  " MainAxisSize"
    -0.62  " يتيمه"
    -0.59  "principalColumn"
    -0.58  "发表于"
    -0.57  "SOUNDBITE"
    -0.57  "RegressionTest"
    Positive Logits
    Logit  Token (quoted; a leading space is part of the token)
    0.77   " there"
    0.75   " less"
    0.73   " more"
    0.59   " leſs"
    0.56   " fewer"
    0.52   " menos"
    0.52   " LESS"
    0.51   " you"
    0.50   "menos"
    0.50   " lefs"
    Activation Density: 0.001%

    No Known Activations