INDEX
Explanations
terms related to consequences or impact
terms related to discussions and assessments of consequences and responsibilities in various contexts
Negative Logits
Token     Logit
poon      -0.70
!,        -0.63
.$        -0.63
dit       -0.62
!.        -0.61
+.        -0.60
cot       -0.59
isse      -0.58
intend    -0.56
iolet     -0.56
Positive Logits
Token     Logit
varies    1.02
remains   0.86
arises    0.83
differs   0.83
becomes   0.79
is        0.77
depends   0.76
isn       0.75
seems     0.75
hasn      0.74
Activations Density: 0.377%