INDEX
Explanations
phrases indicating value or worthiness
Negative Logits
Token           Logit
LookAnd         -0.78
gynhyrchwyd     -0.71
thentication    -0.68
Bielefeld       -0.67
Shakspeare      -0.67
Landau          -0.67
Delia           -0.67
AGC             -0.66
shewn           -0.65
felde           -0.65
Positive Logits
Token           Logit
worth            1.69
Worth            1.56
WORTH            1.53
Worth            1.43
worth            1.34
WORTH            1.32
worthwhile       0.99
Worthy           0.93
wort             0.88
vaut             0.83
Activations Density: 0.044%