INDEX
    Explanations
    Negative Logits
    "aturen"    0.41
    "ША"        0.40
    "Hannah"    0.40
    "HSL"       0.40
    "クル"      0.37
    "HERS"      0.37
    "Lamp"      0.37
                0.37
    "hud"       0.37
    "Scha"      0.37
    Positive Logits
    " Detective"    0.45
    " detective"    0.44
    " geval"        0.39
    "羿"            0.39
    " sacc"         0.38
    "必定"          0.38
    " beh"          0.38
    " o"            0.37
    " backtrack"    0.37
    " 必ず"         0.37
    Act Density 0.001%

    No Known Activations