INDEX
Explanations
mentions of rights or freedoms
expressions indicating personal rights and freedoms
Negative Logits
unexpectedly  -0.77
asionally     -0.69
ktop          -0.68
newcomer      -0.62
Released      -0.62
icipated      -0.61
leased        -0.61
bitious       -0.61
Pebble        -0.60
Joined        -0.59
Positive Logits
argument      1.26
analogy       1.21
explanation   1.19
reasoning     1.19
logic         1.14
argument      1.13
answer        1.06
definitions   1.04
definition    1.04
arguments     1.03
Activations Density: 0.829%