INDEX
    Explanations

    Russian and Gestalt texts

    Negative Logits
    -0.08  " erosion"
    -0.08  "Whit"
    -0.08  " rooft"
    -0.08  " targets"
    -0.08  "flora"
    -0.07  "onne"
    -0.07  "hip"
    -0.07  " richtet"
    -0.07  " childbirth"
    -0.07  " adjusting"

    Positive Logits
    0.08  " Minis"
    0.08  " Did"
    0.08  " courtroom"
    0.07  " bluff"
    0.07  "积分"
    0.07
    0.07  " للتح"
    0.07  " zav"
    0.07  " বের"
    0.07  " العن"
    Act Density 0.001%

    No Known Activations