INDEX
Explanations
information about notable people, including their biographical details and accomplishments
Negative Logits
rium        -0.80
acceptable  -0.77
rians       -0.73
react       -0.72
ria         -0.71
unanim      -0.70
enario      -0.70
aptic       -0.68
uers        -0.67
valid       -0.67
Positive Logits
stint       1.29
Married     1.12
married     1.09
bachelor    1.03
befriend    1.01
graduated   1.01
lifelong    1.00
fluent      0.98
graduate    0.96
founding    0.95
Activations Density: 9.276%