    Explanations

    references to clothing, particularly shirts

    Negative Logits
    "Tikang"          -0.62
    "tvguidetime"     -0.52
    " mods"           -0.50
    "TestingModule"   -0.49
    " transfieras"    -0.49
    " nakalista"      -0.48
    "mods"            -0.47
    "AndEndTag"       -0.47
    "Morfologia"      -0.46
    "plaque"          -0.46
    Positive Logits
    " premises"       0.76
    " shirt"          0.74
    " shirts"         0.67
    "Shirt"           0.59
    " Edward"         0.57
    "shirt"           0.57
    " Shirt"          0.56
    "Shirts"          0.56
    "Edward"          0.55
    "premises"        0.53
    Activation Density: 0.067%

    No Known Activations