Explanations
concepts related to negative thoughts and positive affirmations
Negative Logits
civ      -0.17
achen    -0.16
aga      -0.15
iec      -0.15
ima      -0.14
zel      -0.14
bourg    -0.14
lier     -0.14
chn      -0.14
trope    -0.13
Positive Logits
visualization    0.17
Principle        0.17
Visualization    0.17
visualization    0.17
вну              0.16
congr            0.16
Attr             0.15
oppel            0.15
Success          0.15
Winners          0.15
Activations Density 0.131%