INDEX
    Explanations

    mentions of geographic locations, towns, and people's names

    specific names and notable places

    Negative Logits
    Token               Logit
    " dracon"           -0.74
    " envy"             -0.63
    "etheless"          -0.62
    "jriwal"            -0.62
    " trope"            -0.58
    " fallacy"          -0.57
    " doom"             -0.56
    " transitioning"    -0.56
    " priceless"        -0.56
    " puzz"             -0.55
    Positive Logits
    Token               Logit
    "oz"                0.85
    "ito"               0.82
    "ich"               0.79
    "ak"                0.79
    "ani"               0.78
    "ona"               0.78
    "jan"               0.77
    "oli"               0.77
    "ema"               0.77
    "aj"                0.76
    Activation Density: 0.511%

    No Known Activations