INDEX
Explanations
terms related to family relationships and dynamics
Negative Logits
etta    -0.16
FUNC    -0.14
bond    -0.14
frei    -0.14
ALT     -0.14
alt     -0.14
ALAR    -0.14
سات     -0.14
Burg    -0.13
sher    -0.13
Positive Logits
abal    0.15
ellen   0.15
olis    0.15
ibold   0.14
feb     0.14
otte    0.13
opsy    0.13
ocas    0.13
¢       0.13
Drain   0.12
Activations Density: 0.006%