Explanations
references to things being arbitrary or done without a specific reason
instances of the word "arbitrary" and its related forms
Negative Logits
Token     Logit
oir       -0.88
icans     -0.88
iosis     -0.86
ership    -0.86
iao       -0.86
iere      -0.79
amen      -0.79
ieri      -0.78
lain      -0.77
tha       -0.76
Positive Logits
Token          Logit
guiActiveUn    0.94
whims          0.91
arbitrary      0.86
judicial       0.74
recomp         0.73
whim           0.71
extr           0.71
confinement    0.70
drift          0.70
pret           0.69
Activations Density: 0.020%