    Negative Logits
    Token            Logit
    "Girls"          -0.08
    " taas"          -0.08
    " ISP"           -0.07
    " girls"         -0.07
    "ロン"           -0.07
    "blr"            -0.07
    " synthetic"     -0.07
    "Tan"            -0.07
    " وڃي"           -0.07
    "tan"            -0.07
    Positive Logits
    Token            Logit
    "уын"            0.08
    " imagin"        0.08
    "Emotion"        0.08
    " emotions"      0.08
    "Intensity"      0.07
    " эмоцион"       0.07
    " 분위"          0.07
    " Emotion"       0.07
    " emotion"       0.07
    "odon"           0.07
    Activation Density: 0.004%

    No Known Activations