INDEX
Explanations
phrases that indicate contrasting circumstances or conditions
Negative Logits
Token       Logit
'           -0.61
Chartres    -0.56
יצד         -0.56
WA          -0.56
minerals    -0.53
ћа          -0.53
lika        -0.53
Skinner     -0.49
Torino      -0.49
hassee      -0.48
Positive Logits
Token       Logit
ostante     1.74
despite     1.48
Despite     1.40
despite     1.35
Trotz       1.35
Despite     1.34
nonostante  1.30
Trotz       1.29
Malgré      1.29
spite       1.28
Activations Density: 0.085%