Explanations
concepts related to societal improvement and progressive change
Negative Logits
opa      -0.15
adow     -0.15
edom     -0.15
枝       -0.15
lump     -0.15
flow     -0.14
ledge    -0.14
avers    -0.14
óz       -0.13
split    -0.13
Positive Logits
Contribution    0.20
contribution    0.19
203             0.19
-contrib        0.17
завтра          0.17
contributing    0.17
tomorrow        0.17
contribute      0.16
Beitrag         0.16
contrib         0.16
Activations Density: 0.027%