INDEX
    Explanations

    terms and concepts related to philosophy

    Negative Logits
    <bos>          -2.64
    AsUp           -0.74
    writeField     -0.71
                   -0.69
     convene       -0.67
     inaugurate    -0.65
    /***           -0.65
     avert         -0.62
     conserve      -0.62
    IContainer     -0.61
    Positive Logits
     affor         1.26
     increa        1.21
     thut          1.14
     wien          1.13
     bandung       1.12
     Juf           1.11
     tew           1.10
     yong          1.10
     yoo           1.10
     FFFF          1.08
    Activation Density: 0.034%

    No Known Activations