Explanations
sexual health, testing, STD
Negative Logits
sak  0.53
konsep  0.53
koordin  0.52
peng  0.51
kat  0.51
akademik  0.51
REIT  0.50
kong  0.50
i  0.49
nak  0.49
Positive Logits
surpassed  0.46
lingered  0.45
మారింది  0.45
tattooed  0.42
succeeded  0.42
পরাজিত  0.41
illes  0.41
آیات  0.40
profoundly  0.40
தலை  0.39
Activations Density 0.002%