Explanations
primarily focus on predictions
Negative Logits
drugih  0.90
etiam  0.85
parfois  0.85
senere  0.83
によっては  0.82
orems  0.82
jiné  0.81
بعض  0.80
mengapa  0.80
也能  0.79
Positive Logits
most  1.21
Mostly  1.10
primarily  1.10
Most  1.08
Most  1.08
opted  1.06
MOST  1.05
most  1.05
Primarily  1.05
mainly  1.01
Activations Density: 0.286%