INDEX
Explanations
terms and phrases indicating contrast or opposition
Negative Logits
سطس           -0.66
anskje        -0.59
ñones         -0.59
referenties   -0.57
slee          -0.57
Mete          -0.55
}=[           -0.54
selanjutnya   -0.54
swal          -0.54
makl          -0.53
Positive Logits
reverses      1.11
Reverse       1.05
reverse       1.03
inversion     1.02
opposes       1.01
Opposite      1.00
逆            0.99
reversal      0.97
Inverse       0.97
inver         0.96
Activations Density: 0.242%