INDEX
Explanations
numbers and measurements that indicate loss or damage
occurrences of the word "loss" and its variations indicating various forms of loss
Negative Logits
ansky      -0.71
ENTS       -0.70
ECK        -0.69
ATT        -0.64
ricks      -0.62
inventive  -0.60
abad       -0.60
kov        -0.60
raper      -0.59
Caucas     -0.59
Positive Logits
aversion   0.93
iem        0.85
incurred   0.85
Loss       0.84
loss       0.81
loss       0.75
luster     0.75
esville    0.74
front      0.73
lessly     0.72
Activations Density: 0.023%