INDEX
Explanations
references to freedom and related rights or protections
Negative Logits
antry                -0.15
Kültür               -0.15
imen                 -0.15
ØŃاضر                -0.15
#!                   -0.15
longleftrightarrow   -0.14
ester                -0.14
urban                -0.14
'];?>                -0.14
ocket                -0.13
Positive Logits
fighters             0.24
Fighters             0.23
fighter              0.22
Fighter              0.19
ibold                0.18
/lib                 0.18
fighters             0.18
Freedom              0.17
Freedom              0.17
loving               0.17
Activations Density 0.018%