INDEX
    Explanations

    names of individuals or contributors in academic contexts

    Negative Logits
    " Japanese"       -0.74
    " Japan"          -0.73
    " Jap"            -0.65
    "Japan"           -0.65
    "Japanese"        -0.63
    "Tracce"          -0.63
    "didSet"          -0.61
    " japan"          -0.58
    " JAPAN"          -0.58
    " Tokyo"          -0.55
    Positive Logits
    " Take"            0.64
    " Oh"              0.60
    " Taken"           0.56
    " Kit"             0.56
    "Take"             0.56
    "specialchars"     0.55
    "Kit"              0.53
    "TAKE"             0.53
    "oneofs"           0.53
    " Tag"             0.51
    Activation Density: 0.238%

    No Known Activations