INDEX
NEGATIVE LOGITS
soort        0.90
amount       0.84
amount       0.76
kind         0.76
!”           0.74
ताच          0.74
kinds        0.73
!”           0.73
happening    0.71
نوع          0.69
POSITIVE LOGITS
did          0.83
does         0.80
ceso         0.78
тэй          0.75
的事         0.74
ulfate       0.71
funcionan    0.71
funciona     0.71
old          0.71
kasar        0.71
Activations Density 0.176%