INDEX
Explanations
the word "low" and variations of it in the text
references to minimal or low-quality characteristics
Negative Logits
andise              -0.79
tnc                 -0.76
Chaser              -0.67
iott                -0.67
ADRA                -0.66
Ashe                -0.65
natureconservancy   -0.63
Disorder            -0.61
greg                -0.61
itutional           -0.60
Positive Logits
brow                1.00
ered                0.97
ball                0.88
est                 0.87
enthal              0.87
hanging             0.84
down                0.82
profile             0.80
profile             0.79
enstein             0.79
Activations Density 0.044%