INDEX
Explanations
mentions of items that come in pairs
references to "pair" or sets of items
Negative Logits
ulhu            -0.98
inez            -0.71
Causes          -0.69
Interstitial    -0.69
ESCO            -0.64
UGE             -0.63
avez            -0.63
amaz            -0.63
INA             -0.63
ADRA            -0.62
Positive Logits
pair            0.96
rings           0.95
ings            0.95
ably            0.93
wise            0.89
pair            0.82
lihood          0.79
pieces          0.79
ring            0.77
paired          0.76
Activations Density: 0.018%