INDEX
Explanations
adjectives related to physical attributes and states
vocabulary related to various conditions, characteristics, and qualities of individuals or situations
Negative Logits
ju          -0.72
PN          -0.70
alyst       -0.68
Nar         -0.65
oji         -0.65
Adv         -0.63
sterdam     -0.63
ainer       -0.63
Stream      -0.61
Bomb        -0.61
Positive Logits
tarian      0.83
isable      0.76
(<          0.72
setups      0.70
Swed        0.67
constructs  0.66
iberal      0.65
ened        0.65
shack       0.63
tabl        0.62
Activations Density: 0.305%