Explanations
animal intelligence and ethology
Negative Logits
Token        Logit
угла         0.41
neden        0.40
Woo          0.39
असली         0.39
arium        0.39
suplement    0.39
ooled        0.39
lined        0.39
Sales        0.38
avin         0.38
Positive Logits
Token        Logit
anarchist    0.51
anarch       0.48
abolition    0.45
autonom      0.44
comrades     0.43
compañ       0.42
communs      0.40
autonomy     0.39
刑           0.39
foragers     0.39
Activations Density: 0.056%