INDEX
    Explanations
    NEGATIVE LOGITS
     Guerre       0.55
     guerre       0.50
     dinh         0.50
     wohn         0.47
     Griechen     0.47
     Früh         0.47
     miền         0.45
     युद्ध         0.44
     Тро          0.44
     tinggal      0.44
    POSITIVE LOGITS
    t             0.54
     adhipp       0.43
     autom        0.43
     telep        0.43
     allevi       0.42
     postdoc      0.42
    ە             0.42
     anthocyan    0.42
    ₂,            0.41
    uvad          0.41
    Activation Density: 0.005%

    No Known Activations