Explanations
variations of the word "operate" across different contexts
Negative Logits
led        -0.18
olic       -0.18
Fathers    -0.17
boxed      -0.16
e          -0.16
ey         -0.15
ing        -0.15
eper       -0.15
itary      -0.15
eken       -0.14
Positive Logits
ational    0.24
атив       0.22
etta       0.22
atings     0.20
ativ       0.19
ative      0.18
ATIONAL    0.18
ATORS      0.18
atives     0.17
-oper      0.17
Activations Density 0.006%