Explanations
expressions related to fear of consequences or repercussions in various contexts
risks or negative outcomes
risk and harm
Negative Logits
<bos>            -0.63
autorytatywna    -0.56
HideFlags        -0.55
StatefulWidget   -0.54
HasBeenSet       -0.54
Климат           -0.53
fiées            -0.53
FunctionFlags    -0.52
TestBed          -0.52
gubern           -0.51
Positive Logits
يتيمه            0.95
risk             0.85
risking          0.83
harm             0.83
jeopardi         0.80
jeopardize       0.78
risked           0.78
万一             0.76
potentially      0.76
lose             0.73
Activations Density 0.481%
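
The density figure above is most naturally read as the share of sampled token positions on which this feature is active. The page does not say how it is computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires; the dashboard itself does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of which are active.
toy = [0.0] * 995 + [0.7, 1.2, 0.3, 2.5, 0.1]
density = activation_density(toy)
print(f"{density:.3%}")  # prints "0.500%"
```

Under this reading, a density of 0.481% would correspond to the feature being active on roughly 1 in 208 sampled positions.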