INDEX
Explanations
phrases that express capabilities or potential
Negative Logits
Token              Logit
positifs           -0.70
StatelessWidget    -0.68
rashed             -0.67
utnik              -0.67
Fros               -0.66
eningrad           -0.65
principalColumn    -0.64
Martens            -0.63
suspicions         -0.63
Kirs               -0.62
Positive Logits
Token              Logit
abilities          1.52
ability            1.52
Ability            1.43
Ability            1.32
Abilities          1.31
capability         1.22
Abilities          1.17
capabilities       1.13
Able               1.10
able               1.07
Activations Density: 0.082%