Negative Logits
s             0.57
              0.50
percent       0.44
percentage    0.44
rules         0.43
              0.43
code          0.42
council       0.42
pinc          0.42
Jo            0.41
Positive Logits
癶             0.61
ฝึก            0.57
dissati       0.56
βοη           0.56
              0.56
ងឺ            0.55
<unused1869>  0.55
<unused1825>  0.54
sistemat      0.54
学習           0.54
Activations Density 0.002%