INDEX
Explanations
occurrences of the word "justice" in various contexts
Negative Logits
hing        -0.79
acid        -0.78
hetically   -0.75
ramer       -0.73
Dak         -0.73
ergy        -0.72
igslist     -0.72
spring      -0.71
ula         -0.70
urat        -0.70
Positive Logits
justice     0.97
fulness     0.96
FUL         0.81
justice     0.80
lessness    0.75
SYSTEM      0.74
cellence    0.72
ĪĴ          0.72
injustice   0.71
fully       0.71
Activations Density 0.026%