INDEX
Explanations
expressions of worthiness or value
Negative Logits
Token       Logit
Frazier     -0.71
Zig         -0.70
eteria      -0.69
gdala       -0.67
ulated      -0.65
Leilan      -0.64
acid        -0.63
ole         -0.62
esville     -0.61
ulating     -0.61
Positive Logits
Token           Logit
minded          0.84
applause        0.80
consideration   0.78
deserving       0.77
contenders      0.75
ãģĨ             0.73
aspirations     0.71
worthy          0.71
OGR             0.71
successor       0.71
Activations Density 0.007%