INDEX
Explanations
discussions about fears and concerns related to health and safety
Negative Logits
FormTagHelper   -0.89
"])             -0.81
__))            -0.79
']],            -0.76
SourceChecksum  -0.75
')")            -0.74
TestingModule   -0.73
']]             -0.73
>"+             -0.72
}]              -0.72
Positive Logits
fucking    0.92
stuff      0.91
shitty     0.80
FUCKING    0.80
goddamn    0.77
freaking   0.77
dudes      0.74
weird      0.74
mierda     0.73
badass     0.73
Activations Density 0.472%