Explanations
concepts related to freedom or the state of being free
Negative Logits
Token     Logit
oscope    -0.74
Saban     -0.73
Bog       -0.66
Eug       -0.63
need      -0.63
ents      -0.62
ENTS      -0.62
Ange      -0.62
amel      -0.62
Tie       -0.61
Positive Logits
Token     Logit
roam      0.85
bies      0.85
boot      0.84
spirited  0.73
zing      0.72
born      0.71
indemn    0.71
icient    0.70
bern      0.70
zes       0.70
Activations Density: 0.014%