INDEX
    Explanations

    words indicating approximation or estimation

    Negative Logits
    "oli"        -0.68
    "ieu"        -0.68
    "era"        -0.68
    "ves"        -0.65
    "ters"       -0.65
    "woods"      -0.64
    "rive"       -0.64
    "rift"       -0.62
    " Rebels"    -0.62
    "Express"    -0.62
    Positive Logits
    " analogous"     0.91
    " 820"           0.86
    " WATCHED"       0.85
    " 800"           0.84
    " 9000"          0.84
    " 200"           0.83
    " 700"           0.81
    "Ĥİ"             0.81
    "Ń·"             0.81
    " equivalent"    0.81
    Activation Density: 0.031%

    No Known Activations