Explanations
mentions of geographic locations, towns, and people's names
specific names and notable places
Negative Logits
dracon          -0.74
envy            -0.63
etheless        -0.62
jriwal          -0.62
trope           -0.58
fallacy         -0.57
doom            -0.56
transitioning   -0.56
priceless       -0.56
puzz            -0.55
Positive Logits
oz    0.85
ito   0.82
ich   0.79
ak    0.79
ani   0.78
ona   0.78
jan   0.77
oli   0.77
ema   0.77
aj    0.76
Activations Density: 0.511%