INDEX
    Explanations

    capital cities of countries

    Negative Logits
    "と共に"        0.38
    "从而"          0.36
    "さんと"        0.36
    " 거고"         0.36
    " туда"        0.35
    "所で"          0.35
    " 그래서"       0.34
    " بتوان"       0.34
    " যাদের"       0.33
    "别人的"        0.33

    Positive Logits
    " consists"    0.82
    " είναι"       0.76
    " is"          0.76
    " are"         0.71
    " differs"     0.67
    " jsou"        0.60
    " involves"    0.59
    " comprises"   0.59
    " bestaat"     0.57
    " transcends"  0.57
    Activation Density: 0.090%

    No Known Activations