INDEX
Explanations
concepts related to openness and open relationships
Negative Logits
Token       Logit
Opening     -0.21
Opening     -0.20
opening     -0.20
opening     -0.20
opener      -0.19
itesse      -0.17
-opening    -0.17
gio         -0.16
gu          -0.16
cade        -0.16
Positive Logits
Token       Logit
-ended      0.42
-air        0.36
ended       0.33
ended       0.33
-door       0.30
Ended       0.30
ning        0.30
-source     0.29
Ended       0.28
eing        0.27
Activations Density: 0.044%