INDEX
Explanations
words related to imitation or replication
words related to imitation and mimicking behaviors
Negative Logits
Token       Logit
recall      -0.66
Kore        -0.66
orage       -0.65
IFE         -0.61
Cort        -0.59
spring      -0.58
awakening   -0.58
Ads         -0.57
Sahara      -0.57
orer        -0.57
Positive Logits
Token       Logit
bley        1.21
pered       1.20
neys        1.18
kered       1.14
ped         1.09
ply         1.07
bered       1.06
ney         1.04
icked       1.03
etr         1.01
Activations Density: 0.116%