    Negative Logits
     cheese     -0.93
     potato     -0.85
     sard       -0.84
     Phòng      -0.84
     Ар         -0.84
     tomato     -0.84
     medieval   -0.83
    申し込み    -0.82
     butter     -0.82
    尔夫        -0.81
    Positive Logits
     Chinese    1.32
     Asian      1.19
     China      1.13
     китай      1.12
    Chinese     1.09
    China       0.99
     бам        0.98
     cinese     0.97
    Asian       0.92
     wok        0.91
    Activation Density: 0.016%

    No Known Activations