INDEX
    Explanations

    pronouns followed by actions

    Negative Logits
    Token             Value
    "Notably"         1.11
    "较为"            1.08
    "にて"            1.00
    " Notably"        0.98
    " এরূপ"           0.97
    " poiché"         0.95
    " نیز"            0.93
    " oldukça"        0.92
    " hehe"           0.91
    [token not shown] 0.89
    Positive Logits
    Token             Value
    " throw"          0.90
    " sitting"        0.88
    " wanna"          0.87
    " look"           0.85
    " throws"         0.84
    "াম্প"            0.81
    " sit"            0.81
    " surround"       0.78
    " sat"            0.77
    " threw"          0.76
    Activation Density: 0.078%

    No Known Activations