Explanations
words related to arbitrariness or lack of order
terms related to arbitrary and unfair actions or decisions
Negative Logits
oan       -0.88
nets      -0.86
iesel     -0.81
iosis     -0.81
oir       -0.80
rists     -0.80
iar       -0.79
è¦ļéĨĴ    -0.74
marks     -0.74
uries     -0.73
Positive Logits
ety          0.82
jer          0.74
jerk         0.72
disob        0.69
confinement  0.65
duck         0.64
è¦           0.64
maj          0.64
Xie          0.62
tur          0.61
Activations Density 0.140%