INDEX
Explanations
phrases describing the condition and quality of objects
Negative Logits
voks       -0.16
trị        -0.14
atti       -0.14
Talent     -0.14
Rica       -0.14
공식       -0.13
áÅĻ        -0.13
rog        -0.13
aggress    -0.13
umph       -0.13
Positive Logits
condition     0.78
Condition     0.66
condition     0.60
Condition     0.56
CONDITION     0.55
-condition    0.45
_condition    0.44
.condition    0.42
(condition    0.41
CONDITION     0.41
Activations Density 0.102%