INDEX
    Explanations

    lists of items, such as countries, representatives, or movies

    phrases indicating lists

    Negative Logits
    ester       -0.72
    entimes     -0.72
    imet        -0.71
    gypt        -0.69
    iva         -0.67
    imeter      -0.65
     breeze     -0.64
    athy        -0.64
    lycer       -0.63
    ashtra      -0.62
    Positive Logits
     sorts            0.89
     entries          0.85
     accomplishments  0.79
    ーン                0.76
     items            0.75
    Topics            0.74
     ingredients      0.72
     celebrities      0.71
     keywords         0.71
     grievances       0.70
    Act Density 0.107%

    No Known Activations