INDEX
    Explanations
    NEGATIVE LOGITS
     fabricated   -0.66
    ascript       -0.65
     facade       -0.62
    76561         -0.62
     sampling     -0.62
     Parables     -0.61
     scare        -0.61
     plaque       -0.61
    vertisement   -0.60
    PIN           -0.59
    POSITIVE LOGITS
    etc           0.98
     TOTAL        0.84
    Others        0.78
     Flavoring    0.77
     Lastly       0.73
     Finally      0.72
     Conclusion   0.71
    Conclusion    0.71
     âĶľ          0.70
    rador         0.69
    Act Density 0.157%

    No Known Activations