INDEX
    Explanations

    phrases related to specific terms and technical jargon

    references to the term "guilty" and its variations

    Negative Logits
    Token           Logit
    "hips"          -0.85
    "marked"        -0.82
    "umbledore"     -0.80
    "weeney"        -0.79
    "ikarp"         -0.78
    "killer"        -0.74
    "rider"         -0.74
    " icing"        -0.72
    "fully"         -0.70
    "battle"        -0.68
    Positive Logits
    Token           Logit
    "ibi"           0.94
    "ilty"          0.79
    " Netanyahu"    0.78
    "acl"           0.77
    "ÄŁ"            0.74
    "ose"           0.74
    "ffer"          0.73
    "vernment"      0.70
    " Mesh"         0.68
    " Hasan"        0.66
    Activation Density: 0.007%

    No Known Activations