INDEX
    Explanations
    Negative Logits
    Token            Logit
    "ドライブ"       -1.05
    "driving"        -1.02
    " contatto"      -1.01
    " Driving"       -1.00
    " driving"       -0.96
    "rscheinlich"    -0.96
    " sailing"       -0.95
    " Drive"         -0.94
    " Sailing"       -0.93
    " kring"         -0.93
    Positive Logits
    Token            Logit
    " walk"          2.11
    " walking"       2.00
    " walked"        1.81
    " walks"         1.55
    "Walking"        1.55
    "walk"           1.52
    "เดิน"           1.42
    "walking"        1.39
    "Walk"           1.38
    " Walking"       1.32
    Activation Density: 0.012%

    No Known Activations