INDEX
Explanations
abstract concepts or philosophical statements about belief and personal opinions
Negative Logits
£¼        -0.19
ilos      -0.16
inear     -0.15
elas      -0.15
.pages    -0.15
inez      -0.15
kovÄĽ     -0.15
reon      -0.14
gros      -0.14
abay      -0.14
Positive Logits
.maven    0.17
Compose   0.14
fl        0.14
ue        0.14
Ahead     0.14
.lp       0.14
(         0.14
Yaz       0.14
ib        0.14
usch      0.14
Activations Density: 0.048%