INDEX
    Explanations

    concepts related to misconceptions and fallacies in reasoning

    Negative Logits
    " purpoſe"    -0.77
    " متعلقه"    -0.76
    "Diwedd"      -0.73
    " houſe"      -0.72
    " itſelf"     -0.69
    "ſelf"        -0.68
    " Jefus"      -0.67
    " uſe"        -0.66
    "حياتها"      -0.66
    "ſelves"      -0.66
    Positive Logits
    " misunder"         0.59
    " often"            0.59
    " wrongly"          0.54
    " wrong"            0.54
    " ignor"            0.53
    " Often"            0.53
    " mistaken"         0.53
    " misunderstand"    0.51
    " sometimes"        0.50
    " incorrect"        0.50
    Act Density 0.454%

    No Known Activations