INDEX
Explanations
under specific conditions or supervision
Negative Logits
ى             0.87
m             0.79
to            0.66
postérieures  0.61
на            0.61
сез           0.59
و             0.59
postérieure   0.58
z             0.57
م             0.57
Positive Logits
under     1.13
neath     1.03
under     1.02
auspices  0.96
purview   0.90
Under     0.81
guise     0.78
unter     0.77
bajo      0.77
scrutiny  0.76
Activations Density: 0.032%