INDEX
    Explanations

    proper names and places

    Negative Logits
    Token      Logit
    _          -1.73
    )          -1.57
    [          -1.45
    often      -1.41
               -1.34
    痤疮       -1.33
     męska     -1.32
    selben     -1.31
     [{\n      -1.30
     sostu     -1.28
    Positive Logits
    Token      Logit
     from      2.42
    </h4>      2.11
     with      1.94
     because   1.93
     of        1.91
     more      1.71
     since     1.47
     for       1.44
     "¿        1.43
    ној        1.38
    Act Density 0.564%

    No Known Activations