Explanations
This neuron looks for words that emphasize specific elements or comparisons
frequently mentioned subjects and their attributes
Negative Logits
£ı          -0.89
Ń·          -0.86
nsic        -0.82
swick       -0.80
«ĺ          -0.76
allery      -0.74
auc         -0.74
CrossRef    -0.72
isky        -0.72
LEASE       -0.70
Positive Logits
Ascension   0.63
breaths     0.60
Summers     0.59
Sandwich    0.59
wings       0.58
weap        0.58
awaited     0.58
compos      0.58
athe        0.58
seeker      0.57
Activations Density: 0.121%