INDEX
    Explanations

    phrases indicating familiarity or recognition

    references to familiarity with concepts or objects

    Negative Logits
    "chance"    -0.79
    "Chance"    -0.72
    "scoring"   -0.68
    "hap"       -0.66
    "mpeg"      -0.65
    "oner"      -0.64
    "tan"       -0.64
    "rate"      -0.64
    "reme"      -0.63
    "hemy"      -0.63
    Positive Logits
    " familiar"        1.05
    "iliar"            0.92
    "isable"           0.88
    " faces"           0.81
    "igan"             0.81
    "idad"             0.80
    " recognizable"    0.79
    "iable"            0.77
    " enough"          0.77
    " unfamiliar"      0.76
    Act Density 0.012%

    No Known Activations