INDEX
    Explanations

    phrases related to being a first in a sequence or category

    mentions of "first" in various contexts

    Negative Logits
    " Machines"     -0.74
    " Twins"        -0.73
    " lobb"         -0.71
    "assies"        -0.70
    " Dru"          -0.70
    " embassies"    -0.67
    " Vital"        -0.66
    " Canaver"      -0.65
    " visuals"      -0.64
    " segments"     -0.64
    Positive Logits
    "achable"       0.71
    "cially"        0.70
    "taboola"       0.69
    "clamation"     0.69
    "ailable"       0.68
    "statement"     0.67
    " contender"    0.66
    "inkle"         0.65
    " apiece"       0.64
    " berth"        0.64
    Act Density 0.147%

    No Known Activations