INDEX
Explanations
words related to legal cases or court statements
Negative Logits
bis            -0.74
LINE           -0.73
combust        -0.65
hypert         -0.65
Helsinki       -0.64
Dani           -0.63
ften           -0.60
Constantin     -0.60
Stockholm      -0.60
illuminating   -0.58
Positive Logits
oin            1.00
ience          1.00
iences         0.98
eatures        0.92
din            0.92
iated          0.84
icates         0.84
alties         0.83
enos           0.82
iments         0.81
Activations Density: 7.896%