INDEX
Negative Logits
Helping     0.48
promoting   0.44
HELP        0.43
Specific    0.43
chiar       0.42
TSS         0.42
자체가      0.42
Helping     0.42
你也        0.41
Chelsea     0.41
Positive Logits
however     0.68
však        0.68
όμως        0.68
പക്ഷേ       0.61
natomiast   0.61
gladly      0.60
però        0.58
azonban     0.57
viszont     0.56
却是        0.56
Activations Density 0.003%