INDEX
    Explanations

    coal, oil, gold, honey, coffee

    Negative Logits
        "ہ"  0.93
        "ین"  0.83
        "л"  0.71
        "وک"  0.71
        "ف"  0.70
        "ز"  0.67
        "ک"  0.63
        "in"  0.61
        " наук"  0.61
        " друга"  0.60
    Positive Logits
        " that"  0.83
        "K"  0.79
        "I"  0.75
        "X"  0.70
        "AY"  0.69
        "F"  0.68
        " you"  0.65
        "J"  0.65
        "O"  0.64
        "N"  0.64
    Activation Density: 0.398%

    No Known Activations