INDEX
    Explanations

    question words in different languages

    Negative Logits
     ইহাই  0.52
     inbuilt  0.48
     incul  0.45
     exists  0.43
     india  0.43
     intrinsic  0.43
     inherent  0.41
     latest  0.41
    agic  0.41
    Это  0.40
    Positive Logits
     யார்  0.58
     nasıl  0.57
     어떻게  0.57
    [token not rendered]  0.57
    [token not rendered]  0.57
     хто  0.57
     когда  0.54
     ktoś  0.54
     όταν  0.54
     кто  0.54
    Act Density 0.004%

    No Known Activations