INDEX
Explanations
references to institutions, educational contexts, and discussions of societal structures
Negative Logits
Token       Logit
kso         -0.53
olstein     -0.48
wayat       -0.46
Simplicity  -0.44
Gegens      -0.43
なかな      -0.43
scope       -0.43
gena        -0.43
blaze       -0.43
apathy      -0.42
Positive Logits
Token       Logit
rely        1.88
relies      1.82
depend      1.74
depended    1.69
reliance    1.63
dependence  1.60
reliant     1.54
relying     1.54
relied      1.53
depends     1.53
Activations Density: 0.377%