INDEX
NEGATIVE LOGITS
ză            0.72
suppresses    0.70
orthogonality 0.68
puppy         0.67
plumbers      0.66
actuation     0.65
娀            0.64
attendant     0.64
TouchEvent    0.64
développe     0.63
POSITIVE LOGITS
وفي           0.92
IN            0.91
爾            0.90
ﺭ             0.89
ﻭ             0.84
٬             0.78
D             0.77
瓏            0.77
ﺩ             0.75
savor         0.74
ACTIVATIONS DENSITY: 0.004%