INDEX
Explanations
sentences describing something subtle or minor
instances of mild or minor changes or differences
Negative Logits
mberg       -0.82
CDC         -0.74
tested      -0.65
Ready       -0.64
Chronic     -0.64
ceans       -0.64
convened    -0.62
answered    -0.62
halla       -0.62
Kardash     -0.61
Positive Logits
slight         0.88
exaggeration   0.82
modification   0.79
annoyance      0.78
nesses         0.76
resemblance    0.76
sway           0.74
rance          0.73
inclination    0.73
ihad           0.73
Activations Density 0.009%