INDEX
Explanations
expressions related to physical or emotional distress and discomfort
words related to negative experiences or conditions
Negative Logits
IFT          -0.67
Breach       -0.63
ogical       -0.62
happening    -0.59
empt         -0.58
çĶ           -0.58
)].          -0.58
Bring        -0.58
Line         -0.57
Infinite     -0.57
Positive Logits
aned         4.19
eded         1.04
anol         1.03
ane          0.95
bered        0.95
aved         0.93
aired        0.91
anes         0.91
anism        0.86
anan         0.84
Activations Density: 0.018%