INDEX
Explanations
social media handles with a specific symbol
instances of a specific character or symbol in various contexts
Negative Logits
Token         Logit
ordinate      -0.82
perate        -0.79
compens       -0.76
interstitial  -0.75
wagen         -0.70
worms         -0.70
ractor        -0.70
anes          -0.70
annex         -0.68
ciplinary     -0.68
Positive Logits
Token             Logit
————————          1.16
VIDEOS            1.09
————              1.04
————————————————  0.89
Kimber            0.80
Rabbi             0.74
Jonah             0.74
Jem               0.74
Ùħ                0.73
avanaugh          0.72
Activations Density: 0.061%