Explanations
themes related to social issues and justice
Negative Logits
á»Ń  -0.14
ä¸ĢåĮº  -0.13
lickr  -0.13
zan  -0.12
zÃŃ  -0.12
avatel  -0.11
æĭĽ  -0.11
æĭĽ  -0.11
ỡ  -0.11
ascus  -0.11
Positive Logits
align  0.46
match  0.46
matches  0.44
coincide  0.44
coinc  0.43
correspond  0.41
matched  0.39
align  0.38
aligned  0.38
Align  0.37
Activations Density 0.461%