Explanations
references to dismissive or trivializing language regarding serious topics
Negative Logits
yntaxException  -0.62
secr            -0.61
devamını        -0.59
writerow        -0.58
ChromeDriver    -0.53
conflicto       -0.53
cuire           -0.53
FormState       -0.52
jelder          -0.52
mutlich         -0.52
Positive Logits
trivial         1.02
frivolous       0.93
frivol          0.90
trivi           0.86
casually        0.84
trivial         0.82
cavalier        0.78
trifling        0.78
nonchal         0.75
unimportant     0.74
Activations Density: 0.461%