Explanations
words related to legal or justice-related concepts
terminology related to significant events or issues in societal contexts
Negative Logits
impulse       -0.73
whine         -0.68
hump          -0.66
HOME          -0.65
obstruction   -0.64
choir         -0.61
slump         -0.60
avorite       -0.59
clerosis      -0.58
cube          -0.57
Positive Logits
ented          1.11
marked         1.09
inged          1.07
oused          1.05
icating        1.02
itled          1.01
ared           1.01
athed          1.00
atered         0.99
osed           0.99
Activations Density: 0.236%