INDEX
    Explanations

    mentions of historical events or discussions surrounding Jewish identity and persecution

    Negative Logits
    "was"        -0.28
    "Was"        -0.25
    " wasn"      -0.25
    " Was"       -0.24
    "_was"       -0.23
    " isnt"      -0.22
    " was"       -0.21
    " isn"       -0.21
    " Isn"       -0.18
    " conver"    -0.18
    Positive Logits
    " are"       0.83
    " aren"      0.62
    "_are"       0.55
    " were"      0.54
    " ARE"       0.53
    " Are"       0.52
    "Are"        0.50
    "are"        0.48
    " são"       0.46
    ".are"       0.45
    Act Density 3.793%

    No Known Activations