INDEX
    Explanations

    mathematical expressions, particularly those involving variables, equations, and symbols

    Negative Logits
    " дописавши"    -0.76
    " للمعارف"    -0.75
    " themſelves"    -0.73
    " itſelf"    -0.73
    "abestanden"    -0.72
    "ſelf"    -0.72
    " myſelf"    -0.69
    "ſelves"    -0.69
    "---*/"    -0.67
    " فريبيس"    -0.67
    Positive Logits
    "Lire"    0.55
    " total"    0.51
    " &___"    0.51
    " tal"    0.50
    "hane"    0.47
    " jo"    0.47
    "BASELINE"    0.46
    "thums"    0.46
    " TOTAL"    0.46
    " localObject"    0.45
    Activation Density: 1.497%

    No Known Activations