Explanations
actions related to preventing or thwarting negative outcomes
preventing or thwarting actions
Negative Logits
Token          Logit
endregion      -0.54
出版年         -0.52
intios         -0.47
kasarigan      -0.42
Salta          -0.40
Slf            -0.40
Tikang         -0.40
dataclass      -0.40
ngilizce       -0.40
transfieras    -0.39
Positive Logits
Token          Logit
prevented      0.65
verhindert     0.65
voorkomen      0.63
prevent        0.59
阻止           0.56
preventing     0.56
prevent        0.56
averted        0.56
Prevent        0.55
avert          0.54
Activations Density 0.035%