INDEX
Explanations
sentences related to changes in circumstances or situations
Negative Logits
Token                   Logit
of                      -0.50
ням                     -0.46
pob                     -0.46
Replacing               -0.45
bort                    -0.45
ium                     -0.45
[                       -0.44
たら                    -0.44
,                       -0.44
(token text missing)    -0.43
Positive Logits
Token     Logit
things    1.15
Things    1.12
things    1.06
$_"       1.06
Things    1.05
ſelf      1.04
surla     1.04
THINGS    1.03
THINGS    0.99
itſelf    0.91
Activations Density: 0.140%