Explanations
references to significant cases or symbols related to political and social issues
Negative Logits
Token             Logit
enge              -0.14
ظ                 -0.14
avir              -0.14
aire              -0.14
eland             -0.14
AccessException   -0.14
ium               -0.13
idor              -0.13
:not              -0.13
.super            -0.13
Positive Logits
Token             Logit
rallying          0.24
shorthand         0.23
tal               0.22
orthand           0.20
sort              0.19
icon              0.19
syn               0.19
touch             0.18
icon              0.18
catch             0.18
Activations Density 0.050%