INDEX
Explanations
expressions of contrast or contradiction
Negative Logits
rets       -0.16
joy        -0.15
material   -0.15
jie        -0.15
UIS        -0.15
oi         -0.14
oo         -0.14
reesome    -0.14
oom        -0.14
je         -0.14
Positive Logits
contrary   0.49
opposite   0.40
contrario  0.38
naopak     0.36
quite      0.35
Quite      0.33
反         0.33
ngược      0.29
quite      0.29
Au         0.27
Activations Density 0.127%