INDEX
Explanations
names of people
references to the name "Jonathan."
Negative Logits
Token      Logit
ials       -0.75
iple       -0.71
¥µ         -0.67
isters     -0.66
housing    -0.65
huge       -0.62
ebin       -0.62
cycles     -0.61
agall      -0.61
ional      -0.61
Positive Logits
Token      Logit
athan      0.88
Cape       0.81
Gru        0.81
athon      0.80
Swift      0.80
Coul       0.79
Ernst      0.76
Cohn       0.76
Bernstein  0.74
Pearce     0.73
Activations Density: 0.029%