INDEX
Explanations
references to legal and political contexts
Negative Logits
utenberg      -0.16
Gutenberg     -0.16
ailer         -0.15
arella        -0.14
_increment    -0.14
tô            -0.14
zorun         -0.14
buộc          -0.14
anners        -0.14
ruits         -0.14
Positive Logits
pardon        0.43
pard          0.42
commute       0.34
commuting     0.32
comm          0.32
release       0.31
pard          0.31
Clem          0.30
parole        0.28
Release       0.28
Activations Density 0.069%