INDEX
Explanations
words related to providing feedback or criticism with the intention of improvement
terms related to constructive feedback and wilderness conservation
Negative Logits
plex      -0.76
ighty     -0.69
gdala     -0.67
achy      -0.66
eph       -0.65
aeda      -0.65
bus       -0.63
clipse    -0.63
igans     -0.63
gary      -0.63
Positive Logits
=]        0.88
ÃĥÃĤ      0.79
^^^^      0.77
Liberties 0.76
ß         0.74
istered   0.73
////////  0.72
IZE       0.68
estern    0.66
zes       0.66
Activations Density: 0.024%