INDEX
Explanations
adjectives and their variants
Negative Logits
NSA         -0.82
employ      -0.79
REP         -0.70
MRI         -0.67
reports     -0.67
GV          -0.66
esson       -0.66
Collider    -0.66
NX          -0.64
paid        -0.64
Positive Logits
paradise    0.90
footh       0.84
soils       0.83
slopes      0.83
greens      0.82
slope       0.82
plains      0.82
areas       0.82
substr      0.82
gardens     0.80
Activations Density: 0.035%