INDEX
Explanations
violent or conflicting situations
words and phrases related to torment and related actions
Negative Logits
enegger          -0.72
PRES             -0.67
Idle             -0.64
Consent          -0.64
McGill           -0.64
OPA              -0.64
éļ               -0.63
stakes           -0.62
DragonMagazine   -0.62
ynski            -0.62
Positive Logits
mented           0.93
onto             0.87
onite            0.82
rid              0.81
vell             0.81
ritch            0.81
rums             0.78
orable           0.77
anz              0.77
ched             0.76
Activations Density: 0.015%