INDEX
Explanations
positive descriptions or comments about events or objects
Negative Logits
sealing         -0.73
streng          -0.71
dispers         -0.71
ingred          -0.69
rift            -0.65
confir          -0.65
eta             -0.63
sufficient      -0.63
arrang          -0.63
Secondly        -0.62
Positive Logits
cringe          0.91
cliché          0.91
pundits         0.88
wondering       0.83
understandably  0.83
clich           0.81
familiar        0.81
rightly         0.78
tempting        0.77
headlines       0.76
Activations Density: 3.478%