INDEX
    Explanations
    No Explanations Found
    Negative Logits
    -1.93
     habrían
    -1.88
     podían
    -1.80
     comenzaron
    -1.73
     empezaron
    -1.73
     querían
    -1.71
     tienden
    -1.69
     pueden
    -1.66
     tenían
    -1.63
    -1.61
    Positive Logits
    0
    1.78
     shows
    1.66
     with
    1.65
     does
    1.50
    着一个
    1.49
     did
    1.47
     can
    1.45
    并非
    1.41
    って思
    1.38
     performed
    1.34
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.