INDEX
Explanations
concepts related to fairness and well-being in social and legal contexts
Negative Logits
Token             Logit
<eos>             -0.49
(                 -0.47
[not rendered]    -0.46
failed            -0.45
shows             -0.44
constat           -0.44
empat             -0.43
bek               -0.42
cs                -0.42
a                 -0.41
Positive Logits
Token             Logit
ItemBackground    1.11
utafitiHapana     0.97
localctx          0.96
itſelf            0.96
[not rendered]    0.90
脚注の使い方      0.88
Theſe             0.86
équilibr          0.86
myſelf            0.86
queryInterface    0.85
Activations Density 0.436%