Explanations
words related to strange or unconventional characteristics
occurrences of the word "weird" and its variants
Negative Logits
vation        -0.91
ptive         -0.89
Interstitial  -0.89
ailable       -0.84
ptives        -0.83
adr           -0.82
cussion       -0.79
aders         -0.78
FORE          -0.77
idation       -0.77
Positive Logits
ness      1.15
ly        1.08
nesses    0.95
ety       0.90
entimes   0.84
est       0.81
oes       0.76
balls     0.75
Weird     0.75
weird     0.74
Activations Density 0.010%