INDEX
Explanations
references to personal development and self-identity
Negative Logits
Token      Logit
èĴĻ        -0.15
AMY        -0.14
odo        -0.14
empo       -0.14
ufen       -0.14
ewise      -0.14
Ø·ÙĦÙĤ     -0.13
FTA        -0.13
evi        -0.13
gree       -0.13
Positive Logits
Token      Logit
partial    0.34
into       0.28
Partial    0.27
fond       0.25
known      0.24
partial    0.24
Partial    0.24
INTO       0.23
.into      0.22
Into       0.21
Activations Density: 0.212%