INDEX
Explanations
phrases related to legal proceedings and documentation
Negative Logits
seless    -0.78
plex      -0.77
rex       -0.74
rox       -0.71
gal       -0.69
ulo       -0.68
gypt      -0.68
itals     -0.66
grad      -0.66
itia      -0.66
Positive Logits
THING          1.28
WHERE          1.07
significant    0.92
ONE            0.89
meaningful     0.89
particular     0.88
place          0.87
ones           0.83
substantive    0.79
body           0.79
Activations Density: 11.958%