Explanations
mentions of the term "agg" with varying activation strengths
mentions of "agg" or related terms emphasizing aggregation or gathering concepts
Negative Logits
Token        Logit
Defenders    -0.71
graphene     -0.67
phrine       -0.66
birth        -0.64
passage      -0.63
Gemini       -0.63
©¶æ          -0.62
conditional  -0.61
candle       -0.60
)].          -0.58
Positive Logits
Token        Logit
agg          1.08
regate       1.02
iott         1.00
idy          0.90
rieved       0.90
ernaut       0.84
irlf         0.84
regation     0.84
abba         0.82
////         0.82
Activations Density: 0.007%