Explanations
proposed actions or consequences
Negative Logits
sometimes      -1.25
proib          -1.09
sometimes      -1.07
nějak          -1.07
値下げ         -1.05
occasionally   -1.05
ってもら       -1.05
courteous      -1.02
そろそろ       -0.98
aidé           -0.98
Positive Logits
proposed       1.35
will           1.11
implications   1.10
threatens      1.07
Implications   1.07
ohne           1.02
danger         1.01
proposal       1.00
intention      0.97
plan           0.96
Activations Density: 0.073%