INDEX
Explanations
instances of the word "acting" in various contexts
Negative Logits
Token      Logit
xual       -0.70
sbm        -0.69
fer        -0.65
joy        -0.64
roe        -0.64
nic        -0.63
pex        -0.62
gart       -0.62
paralle    -0.61
grown      -0.61
Positive Logits
Token      Logit
uary       1.02
uate       0.84
Director   0.83
prov       0.81
director   0.77
PRESIDENT  0.74
uated      0.74
OTUS       0.69
adm        0.69
cz         0.69
Activations Density 0.009%
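
One way to read the density figure is as the share of sampled tokens on which the feature fires. The sketch below assumes that reading; it is not taken from the dashboard's own code. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen only so that the result matches the 0.009% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the feature is active, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 9 active tokens out of 100,000 sampled tokens.
sample = [0.0] * 99_991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.009% corresponds to roughly 9 active tokens per 100,000, which would make this a very sparse feature.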