INDEX
Explanations
references to political and legal concepts or proceedings
Negative Logits
Token       Logit
ggles       -0.64
partName    -0.61
urry        -0.59
»Ĵ          -0.56
vantage     -0.53
guyen       -0.52
obb         -0.51
idav        -0.51
gencies     -0.51
)\          -0.50
Positive Logits
Token       Logit
because     1.07
whereas     0.95
because     0.92
despite     0.89
.           0.82
owing       0.82
regardless  0.78
but         0.78
unfairly    0.78
although    0.77
Activations Density 1.411%