Explanations
references to authenticity or realness in various contexts
Negative Logits
Token     Logit
iez       -0.15
ara       -0.15
ummies    -0.14
Og        -0.14
eno       -0.14
ummy      -0.14
еты       -0.14
unday     -0.14
loh       -0.14
rama      -0.14
Positive Logits
Token     Logit
actually  0.22
actual    0.20
actually  0.19
Actual    0.19
Actual    0.18
(actual   0.18
_actual   0.17
lia       0.17
actual    0.16
real      0.16
Activations Density 0.329%
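
As a rough illustration of what an activation density figure like the one above usually measures, the sketch below computes the fraction of token positions on which a feature fires. It assumes that density means "share of sampled tokens with activation above a threshold"; the function name, the threshold default, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density is defined as (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of which fire.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.329% would mean the feature is active on roughly 3 of every 1000 sampled tokens.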