    Explanations

    verbs followed by content

    Negative Logits
    " of"        -2.20
    "他の"       -1.47
    "idées"      -1.42
    " before"    -1.38
    " there"     -1.36
    "できた"     -1.36
    "時には"     -1.32
    "終わった"   -1.31
    " olduk"     -1.30
    " also"      -1.29
    Positive Logits
    " his"       1.48
    "[])"        1.38
    " 僕"        1.36
    (blank)      1.34
    (blank)      1.32
    " lequel"    1.30
    " แต่"       1.30
    (blank)      1.28
    " tivesse"   1.25
    " Which"     1.24
    Activation Density: 0.072%

    No Known Activations