INDEX
Explanations
words indicating negative sentiment or claims
Negative Logits
Strategies      0.98
Folders         0.90
Når             0.90
strategies      0.89
Different       0.88
.               0.87
Matrices        0.86
എ               0.86
multiplicación  0.85
algèbre         0.84
Positive Logits
huge            1.36
pissed          1.35
HUGE            1.34
suspiciously    1.32
dubious         1.30
THEIR           1.25
basically       1.24
shitty          1.24
worthless       1.23
dodgy           1.23
Activations Density: 0.035%