INDEX
Explanations
phrases related to news events
references to legal terminology and discussions of laws or regulations
Negative Logits
resy        -0.73
groups      -0.67
soType      -0.67
igmatic     -0.65
inarily     -0.63
Others      -0.62
glances     -0.62
fg          -0.61
viation     -0.61
nown        -0.61
Positive Logits
Camel       0.69
Gw          0.67
Alexandria  0.65
Amelia      0.65
Moscow      0.63
Las         0.62
Nashville   0.62
Mer         0.61
Salman      0.61
Illum       0.60
Activations Density 0.432%