INDEX
    Explanations

    advocated, advocating, voluntarily, persuade

    NEGATIVE LOGITS
     đẳng           0.47
    🏦              0.44
    中部            0.44
    玫瑰            0.43
                    0.43
    आई              0.43
    ائز             0.43
     탱크           0.42
     উত্তর          0.41
     সূত্রে         0.41
    POSITIVE LOGITS
     advocated      0.47
     advocating     0.43
     თავ            0.42
    refresh         0.39
     pledge         0.39
     ulc            0.38
     merely         0.38
     voluntarily    0.38
     persuade       0.38
     persuaded      0.37
    Activation Density: 0.008%

    No Known Activations