INDEX
    Explanations

    theologian, -ologist, -ological

    Negative Logits
    Token       Value
    c           1.02
    x           0.78
    don         0.74
    kenalkan    0.70
    d           0.70
    dır         0.69
    dW          0.68
    dalam       0.67
    dum         0.63
    dah         0.63
    Positive Logits
    Token       Value
    is          1.02
    م           0.98
    as          0.97
                0.96
    ت           0.94
    ing         0.91
    to          0.87
    ic          0.85
    ig          0.83
    جي          0.82
    Act Density 0.001%

    No Known Activations