INDEX
Explanations
mentions of physical pain or anguish
terms related to pain or discomfort
Negative Logits
ogue              -0.67
ODUCT             -0.66
DERR              -0.64
ship              -0.60
$$$$              -0.59
SpaceEngineers    -0.59
Jordanian         -0.59
Tsuk              -0.59
Jud               -0.59
Shepard           -0.59
Positive Logits
phrine            1.00
rette             0.99
lla               0.97
tto               0.90
utic              0.88
ments             0.88
tti               0.84
tta               0.84
lette             0.81
ache              0.79
Activations Density 0.016%
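
For scale, a density of 0.016% corresponds to roughly 16 active tokens per 100,000. The sketch below assumes that density means the fraction of tokens on which the feature's activation is above zero; the page does not state this definition. The function name activation_density, the threshold parameter, and the sample data are all hypothetical and chosen only to reproduce the figure above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature
    fires at all. This definition is not given on the page itself.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, 16 of which activate the feature.
sample = [0.0] * 99_984 + [1.5] * 16
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a feature this sparse is silent on the vast majority of tokens, so its explanations have to be judged from a small set of activating examples.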