INDEX
Explanations
Instances of the word "less" and its variants, marking themes of absence or deprivation.
Negative Logits
.Compute    -0.16
볬          -0.15
atz         -0.15
ongyang     -0.15
ansa        -0.15
negoci      -0.14
AndGet      -0.14
jac         -0.14
abe         -0.14
ладу        -0.14
Positive Logits
            0.16
věd         0.15
berman      0.15
nes         0.15
Lind        0.15
ime         0.14
ços         0.14
ife         0.14
esen        0.14
isol        0.14
Activations Density 0.017%
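
The density figure above is easiest to read as a fraction of token positions. The sketch below assumes that "activation density" means the share of sampled tokens on which the feature's activation is above zero; the page itself does not define the term, so this is an interpretation rather than the dashboard's actual computation. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 17 of 100,000 token positions.
sample = [0.0] * 99_983 + [1.2] * 17
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.017%"
```

Under that reading, a density of 0.017% corresponds to roughly one active token in every 5,900, which fits a feature keyed to a single word family.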