INDEX
    Explanations

    instances of emotional reactions and discussions related to personal experiences

    Negative Logits
    acco             -0.15
    ÑĬ               -0.15
    udge             -0.15
    åħĥ              -0.14
    919              -0.13
    adiator          -0.13
    891              -0.13
    aja              -0.13
    ilden            -0.13
    APT              -0.13

    Positive Logits
    بش               0.17
    eydi             0.16
    è½               0.15
    ENCHMARK         0.14
    ãĤ¤ãĤ¯           0.14
    .Localization    0.14
    ìĨ               0.13
    ž                0.13
    icut             0.13
    UZ               0.13

    Act Density 0.677%

    No Known Activations