    Explanations

    pronoun + is/apostrophe

    Negative Logits
    Token                  Value
    " unaware"             0.53
    " unable"              0.51
    " aware"               0.49
    "ただし"               0.49
    "ることができる"       0.48
    " associated"          0.46
    " preventing"          0.46
    " incap"               0.46
    " procedural"          0.46
    " determining"         0.45

    Positive Logits
    Token                  Value
    " है"                  0.77
    " deserves"            0.75
    " είναι"               0.73
    " выглядит"            0.71
    " is"                  0.69
    " është"               0.69
    "简直"                 0.68
    " är"                  0.67
    "是一个"               0.66
    " seems"               0.66
    Activation Density: 0.001%

    No Known Activations