    Explanations

    quantifying comparisons or language features

    Negative Logits
     paths         0.41
     pathways      0.40
     biofuel       0.40
     websites      0.39
     FAQs          0.38
     BL            0.38
    Paths          0.38
     grassroots    0.38
     gained        0.37
     downloads     0.37
    Positive Logits
    (no token shown)  0.52
     grandiose        0.50
    简直              0.47
    ...!              0.47
     иметь            0.47
     मरम्मत            0.46
     바꾸             0.46
    Siehe             0.46
     despot           0.45
     सोबत             0.45
    Activation Density 0.012%

    No Known Activations