INDEX
Explanations
words related to cognitive abilities and skills
Negative Logits
Token          Logit
Bride          -0.79
bart           -0.59
Schwe          -0.59
Destination    -0.58
gone           -0.58
Hanson         -0.58
algia          -0.58
Alonso         -0.57
Hein           -0.56
Simone         -0.56
Positive Logits
Token          Logit
to              0.91
Reviewer        0.86
ologies         0.86
ibility         0.80
bodied          0.78
itud            0.73
Ability         0.73
ibilities       0.72
assisted        0.72
ively           0.70
Activations Density: 0.038%