INDEX
Explanations
the word "faire" or variations of it in the context of fairness, justice, or equality
terms related to fairness and justice
Negative Logits
Dragonbound  -0.86
kson         -0.76
ITED         -0.73
Grill        -0.70
Barcl        -0.66
Records      -0.65
Presence     -0.65
Slug         -0.64
ebus         -0.64
Emer         -0.64
Positive Logits
rative       1.26
rant         1.16
rance        0.99
rals         0.96
stration     0.96
ration       0.95
rants        0.93
anced        0.91
ral          0.91
rated        0.88
Activations Density: 0.027%