Explanations
the word "Bon" along with variations of it at varying strengths of activation
the brand name "Bon" in various contexts
Negative Logits
Token        Logit
ODE          -0.68
ELD          -0.67
dfx          -0.66
INAL         -0.65
REAM         -0.61
LER          -0.60
Ethics       -0.59
Editorial    -0.58
fluids       -0.58
AY           -0.57
Positive Logits
Token        Logit
anza         1.32
uses         1.09
kers         1.08
obos         1.04
eless        1.03
nie          1.03
iton         1.03
anz          1.01
bon          1.00
gey          0.99
Activations Density: 0.022%