INDEX
Explanations
phrases containing the special character 'Ċ'
expressions of dissatisfaction or criticism towards various aspects of society or behavior
Negative Logits
occas       -0.95
eleph       -0.88
oun         -0.86
exting      -0.83
exha        -0.83
neighb      -0.82
tremend     -0.78
aditional   -0.78
newcom      -0.77
citiz       -0.75
Positive Logits
Therefore           0.94
³³³³³³³³            0.91
³³³³³³³³³³³³³³³³    0.87
Furthermore         0.85
³³³³                0.84
Personally          0.84
Likewise            0.84
Learn               0.83
↵                   0.81
Sadly               0.80
Activations Density: 0.718%