INDEX
Negative Logits
Token            Logit
idols            0.79
اخرى             0.74
כל               0.73
recob            0.73
sakura           0.73
jhelp            0.73
indicadores      0.72
وج               0.71
QnrB             0.70
significativo    0.70
Positive Logits
Token            Logit
lick             0.81
舔               0.79
licking          0.77
lick             0.74
’                0.74
↵↵               0.70
↵↵↵              0.70
4                0.67
зо               0.66
6                0.64
Activations Density: 0.004%