INDEX
Explanations
positive in contrast or ironically
Negative Logits
끝    0.38
మైసూరు    0.38
nurse    0.37
givenChar    0.37
nurse    0.35
Mead    0.35
placé    0.35
两侧    0.35
Quick    0.35
phenotypic    0.35
Positive Logits
Positive    0.78
positive    0.72
positive    0.66
positiva    0.65
positivas    0.64
Positive    0.63
positifs    0.62
pozitiv    0.61
POSITIVE    0.61
positivos    0.59
Activations Density 0.009%