INDEX
Negative Logits
fence         0.42
vidrio        0.41
みんな         0.40
刪             0.39
เซ            0.39
skyscrapers   0.39
océano        0.39
tornado       0.39
озера         0.39
doulou        0.38
Positive Logits
bespoke       0.51
standardised  0.47
whilst        0.46
adverts       0.45
Whilst        0.44
takeaway      0.44
sociable      0.44
predomin      0.43
ethos         0.43
postcode      0.42
Activations Density 0.003%