INDEX
Explanations
references to specific legal sections
references to specific legal sections and articles
Negative Logits
destro      -0.69
bush        -0.62
tails       -0.61
tail        -0.60
projects    -0.60
robe        -0.59
Markus      -0.57
Russ        -0.56
anguages    -0.56
favourite   -0.56
Positive Logits
VIII        0.94
VII         0.89
III         0.83
1070        0.83
XV          0.82
XI          0.81
501         0.81
702         0.80
1861        0.80
§§          0.79
Activations Density 0.050%