INDEX
Explanations
expressions of personal opinions and judgments
Negative Logits
Token     Logit
ertino    -0.16
Click     -0.15
click     -0.14
avour     -0.14
Despite   -0.14
CLICK     -0.14
ataka     -0.14
.this     -0.13
-this     -0.13
Ïģκ       -0.13
Positive Logits
Token     Logit
Plus      0.25
Plus      0.25
plus      0.24
plus      0.24
PLUS      0.23
.plus     0.20
ETA       0.20
EDIT      0.19
EDIT      0.19
PLUS      0.18
Activations Density: 0.475%