Explanations
instances of the word "dod" or related forms, typically indicating avoidance or evasion
Negative Logits
Token       Logit
auen        -0.18
Garrison    -0.17
Gauge       -0.15
ophon       -0.15
asser       -0.15
Sher        -0.14
ullan       -0.14
lessly      -0.14
/bg         -0.14
ÙĥÙĬÙĬÙģ    -0.14
Positive Logits
Token       Logit
ging        0.53
ges         0.49
ged         0.46
gem         0.38
ger         0.37
gy          0.34
gew         0.34
GING        0.33
gers        0.32
gement      0.30
Activations Density 0.013%