INDEX
Explanations
words related to critical evaluation or negativity
words related to blunders and mistakes
Negative Logits
Token       Logit
HER         -0.83
Construct   -0.74
OLOGY       -0.71
geist       -0.70
UAL         -0.69
pour        -0.68
OTOS        -0.68
lectic      -0.68
EMENT       -0.67
RAL         -0.67
Positive Logits
Token       Logit
anches      1.08
ameless     1.07
anche       1.07
ossom       1.03
ogging      0.98
ocking      0.98
itting      0.97
ushes       0.97
itter       0.96
uffs        0.96
Activations Density: 0.009%