Explanations
your body or personal space
Negative Logits
wollten         1.04
waren           1.03
erano           1.01
تھیں            1.01
Were            0.96
Were            0.96
olid            0.94
говорили        0.93
использовали    0.93
wollte          0.92
Positive Logits
enjoys          1.40
thrives         1.37
operates        1.08
undergoes       1.07
receives        1.03
interacts       1.02
occupies        1.02
enjoy           0.99
manages         0.98
scarcely        0.96
Activations Density 0.000%