Explanations
phrases indicative of personal relationships and domestic life
Negative Logits
atile     -0.16
prung     -0.15
elijk     -0.14
elijke    -0.14
ellij     -0.14
ucu       -0.14
comed     -0.14
arez      -0.13
atoon     -0.13
ezi       -0.13
Positive Logits
sometimes     0.21
sometimes     0.21
variably      0.19
often         0.17
vždy          0.17
ometimes      0.17
Sometimes     0.16
Optimizer     0.16
often         0.16
invariably    0.16
Activations Density: 0.096%