Explanations
numeric or citation elements in the text
Negative Logits
disillusion   -0.77
boredom       -0.75
exhaustion    -0.70
realism       -0.67
propri        -0.64
glimps        -0.63
Conf          -0.63
autonomy      -0.62
sarc          -0.62
hindsight     -0.62
Positive Logits
bsite         0.77
ァ            0.76
ンジ          0.75
************  0.74
umerable      0.74
ة             0.72
ر             0.72
fty           0.72
ÍÍ            0.70
terday        0.69
Activations Density: 0.164%
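
A density figure like the one above can be read as the share of sampled tokens on which the feature fires. The following is a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical illustrations, not the dashboard's actual computation.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard's exact definition is not stated.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 2 of 1000 tokens.
sample = [0.0] * 998 + [1.3, 0.4]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.164% would mean the feature is active on roughly 1 or 2 of every 1000 sampled tokens.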