INDEX
    Explanations

    expressions of opinion or judgment about value or worth

    Negative Logits
    Token          Logit
    " fres"        -0.15
    "imet"         -0.14
    "engo"         -0.14
    "elp"          -0.13
    " Casual"      -0.13
    "closest"      -0.13
    "enci"         -0.13
    "preserve"     -0.13
    "ariat"        -0.13
    "hte"          -0.13
    Positive Logits
    Token          Logit
    " meaning"     0.23
    " divide"      0.21
    " double"      0.20
    " triple"      0.19
    " splitting"   0.19
    " dividing"    0.19
    " meanings"    0.18
    " mean"        0.18
    " splits"      0.18
    " facts"       0.18
    Activation Density: 0.064%

    No Known Activations