INDEX
Explanations
references to research or organizational centers
Negative Logits
🏻♀️    -0.87
UrlResolution    -0.80
hubarb    -0.76
Hogarth    -0.74
|    -0.74
McKinnon    -0.72
skiej    -0.72
soldier    -0.71
LEIA    -0.70
surgical    -0.70
Positive Logits
centers    1.66
Centers    1.50
center    1.47
Center    1.47
CENTER    1.37
center    1.30
centres    1.30
centers    1.29
Center    1.26
CENTER    1.25
Activations Density 0.035%