INDEX
Explanations
names of specific locations or individuals
variations of the word "able."
Negative Logits
ORGE      -0.86
eve       -0.84
xon       -0.74
IDA       -0.72
meal      -0.69
Catalyst  -0.67
lli       -0.66
Mour      -0.66
women     -0.65
birth     -0.64
Positive Logits
anca      1.20
ocal      0.99
abl       0.98
eness     0.96
osure     0.92
anches    0.90
ingu      0.89
acas      0.82
anty      0.82
ocaly     0.80
Activations Density 0.009%