INDEX
    Explanations

    questions and commands

    questions and inquiries starting with "What" or "Why."

    Negative Logits
    Token        Logit
    "Iv"         -0.78
    "tin"        -0.76
    "fm"         -0.73
    "76561"      -0.69
    "boat"       -0.68
    "tur"        -0.68
    "åIJ"        -0.68
    "nin"        -0.68
    "loading"    -0.67
    "cffffcc"    -0.66
    Positive Logits
    Token               Logit
    "soever"            0.92
    " Makes"            0.83
    " Lies"             0.83
    " distinguishes"    0.79
    " separates"        0.78
    " Choose"           0.75
    " Definitions"      0.74
    " Emails"           0.73
    " Facts"            0.73
    " Changes"          0.71
    Activation Density: 0.091%

    No Known Activations